mirror of
https://github.com/Xevion/banner.git
synced 2025-12-05 21:14:19 -06:00
docs: setup proper documentation, organize & clean README
4
.gitignore
vendored
@@ -2,6 +2,4 @@
/target
/go/
.cargo/config.toml

**/*.md
!/README.md
src/scraper/README.md
27
Justfile
@@ -1,21 +1,28 @@
default_services := "bot,web,scraper"

# Auto-reloading frontend server
frontend:
    pnpm run -C web dev

backend:
    cargo run --bin banner

# Production build of frontend
build-frontend:
    pnpm run -C web build

build-backend:
    cargo build --release --bin banner
# Auto-reloading backend server
backend services=default_services:
    bacon --headless run -- -- --services "{{services}}"

build: build-frontend build-backend

# Production build that embeds assets
build-prod:
# Production build
build:
    pnpm run -C web build
    cargo build --release --bin banner

# Run auto-reloading development build with release characteristics (frontend is embedded, non-auto-reloading)
# This is useful for testing backend release-mode details.
dev-build services=default_services: build-frontend
    bacon --headless run -- --profile dev-release -- --services "{{services}}" --tracing pretty

# Auto-reloading development build for both frontend and backend
# Will not notice if either the frontend/backend crashes, but will generally be resistant to stopping on their own.
[parallel]
dev: frontend backend
dev services=default_services: frontend (backend services)
142
README.md
@@ -1,125 +1,51 @@
# banner

A Discord bot for executing queries & searches on the Ellucian Banner instance hosting all of UTSA's class data.
A complex multi-service system providing a Discord bot and browser-based interface to UTSA's course data.

## Feature Wishlist
## Services

- Commands
  - ICS Download (get an ICS download of your classes with location & timing perfectly set for every class you're in)
  - Classes Now (find classes happening)
- Autocomplete
  - Class Title
  - Course Number
  - Term/Part of Term
  - Professor
  - Attribute
- Component Pagination
- RateMyProfessor Integration (Linked/Embedded)
- Smart term selection (i.e. Summer 2024 will be selected automatically when opened)
- Rate Limiting (bursting with global/user limits)
- DMs Integration (allow usage of the bot in DMs)
- Class Change Notifications (get notified when details about a class change)
- Multi-term Querying (currently the backend for searching is kinda weird)
- Full Autocomplete for Every Search Option
- Metrics, Log Query, Privileged Error Feedback
- Search for Classes
  - Major, Professor, Location, Name, Time of Day
- Subscribe to Classes
  - Availability (seat, pre-seat)
  - Waitlist Movement
  - Detail Changes (meta, time, location, seats, professor)
    - `time` Start, End, Days of Week
    - `seats` Any change in seat/waitlist data
    - `meta`
- Lookup via Course Reference Number (CRN)
- Smart Time of Day Handling
  - "2 PM" -> Start within 2:00 PM to 2:59 PM
  - "2-3 PM" -> Start within 2:00 PM to 3:59 PM
  - "ends by 2 PM" -> Ends within 12:00 AM to 2:00 PM
  - "after 2 PM" -> Start within 2:01 PM to 11:59 PM
  - "before 2 PM" -> Ends within 12:00 AM to 1:59 PM
- Get By Section Command
  - CS 4393 001 =>
  - Will require SQL to be able to search for a class by its section number
The application consists of three modular services that can be run independently or together:

## Analysis Required
- Discord Bot ([`bot`][src-bot])

Some of the features and architecture of Ellucian's Banner system are not clear.
The following features, JSON, and more require validation & analysis:
  - Primary interface for course monitoring and data queries
  - Built with [Serenity][serenity] and [Poise][poise] frameworks for robust command handling
  - Uses slash commands with comprehensive error handling and logging

- Struct Nullability
  - Many of the responses provided by Ellucian contain nulls, and it is often unclear when and why they are null.
  - Analysis must be conducted to be sure of when to use a string and when it should be nullable (pointer).
- Multiple Professors / Primary Indicator
- Multiple Meeting Times
- Meeting Schedule Types
  - AFF vs AIN vs AHB etc.
- Do CRNs repeat between years?
- Check whether partOfTerm is always filled in, and its meaning for various class results.
- Check which API calls are affected by change in term/sessionID term select
- SessionIDs
  - How long does a session ID work?
  - Do I really require a separate one per term?
  - How many can I activate, are there any restrictions?
  - How should session IDs be checked as 'invalid'?
  - What action(s) keep a session ID 'active', if any?
- Are there any courses with multiple meeting times?
- Google Calendar link generation, as an alternative to ICS file generation
- Web Server ([`web`][src-web])

## Change Identification
  - [Axum][axum]-based server with Vite/React-based frontend
  - [Embeds static assets][rust-embed] at compile time with E-Tags & Cache-Control headers

- Important attributes of a class will be parsed on both the old and new data.
- These attributes will be compared and given identifiers that can be subscribed to.
- When a user subscribes to one of these identifiers, any changes identified will be sent to the user.
- Scraper ([`scraper`][src-scraper])

## Real-time Suggestions
  - Intelligent data collection system with priority-based queuing inside PostgreSQL via [`sqlx`][sqlx]
  - Rate-limited scraping with burst handling to respect UTSA's systems
  - Handles course data updates, availability changes, and metadata synchronization

Various command arguments have the ability to have suggestions appear.
## Quick Start

- They must be fast. As ephemeral suggestions that are only relevant for seconds or less, they need to be delivered in less than a second.
- They need to be easy to acquire. With as many commands & arguments to search as I have, it is paramount that the API be easy to understand & use.
- It cannot be complicated. I only have so much time to develop this.
- It does not need to be persistent. Since the data is scraped and rolled periodically from the Banner system, the data used will be deleted and re-requested occasionally.
```bash
pnpm install -C web # Install frontend dependencies
cargo build # Build the backend

For these reasons, I believe SQLite to be the ideal place for this data to be stored.
It is exceptionally fast, works well in-memory, and is less complicated compared to most other solutions.
just dev # Runs auto-reloading dev build
just dev bot,web # Runs auto-reloading dev build, running only the bot and web services
just dev-build # Development build with release characteristics (frontend is embedded, non-auto-reloading)

- Only required data about the class will be stored, along with the JSON-encoded string.
  - For now, this would only be the CRN (and possibly the Term).
  - Potentially, a binary encoding could be used for performance, but it is unlikely to be better.
- Database dumping into R2 would be good to ensure that over-scraping of the Banner system does not occur.
  - Upon a requested safe close
    - Must be done quickly (<8 seconds)
  - Every 30 minutes, if any scraping occurred.
    - May cause locking of commands.
just build # Production build that embeds assets
```

## Scraping
## Documentation

In order to keep the in-memory database of the bot up-to-date with the Banner system, the API must be scraped.
Scraping will be separated by major to allow for priority majors (namely, Computer Science) to be scraped more often compared to others.
This will lower the overall load on the Banner system while ensuring that data presented by the app is still relevant.
Comprehensive documentation is available in the [`docs/`][documentation] folder.

For now, all majors will be scraped fully every 4 hours with at least 5 minutes between each one.

- On startup, priority majors will be scraped first (if required).
- Other majors will be scraped in arbitrary order (if required).
- Scrape timing will be stored in Redis.
- CRNs will be the Primary Key within SQLite
  - If CRNs are duplicated between terms, then the primary key will be (CRN, Term)

Considerations

- Change in metadata should decrease the interval
- The number of courses scraped should change the interval (2 hours per 500 courses involved)

## Rate Limiting, Costs & Bursting

Ideally, this application would implement dynamic rate limiting to ensure overload on the server does not occur.
Better, it would also ensure that priority requests (commands) are dispatched faster than background processes (scraping), while making sure different requests are weighted differently.
For example, a recent scrape of 350 classes should be weighted 5x more than a search for 8 classes by a user.
Still, even if the cap does not normally allow for this request to be processed immediately, the small user search should proceed with a small bursting cap.

The requirements for this hypothetical system would be:

- Conditional Bursting: background processes or other requests deemed "low priority" are not allowed to use bursting.
- Arbitrary Costs: rate limiting is considered in the form of the request size/speed more or less, such that small simple requests can be made more frequently, unlike large requests.
[documentation]: docs/README.md
[src-bot]: src/bot
[src-web]: src/web
[src-scraper]: src/scraper
[serenity]: https://github.com/serenity-rs/serenity
[poise]: https://github.com/serenity-rs/poise
[axum]: https://github.com/tokio-rs/axum
[rust-embed]: https://lib.rs/crates/rust-embed
[sqlx]: https://github.com/launchbadge/sqlx
94
docs/ARCHITECTURE.md
Normal file
@@ -0,0 +1,94 @@
# Architecture

## System Overview

The Banner project is built as a multi-service application with the following components:

- **Discord Bot Service**: Handles Discord interactions and commands
- **Web Service**: Serves the React frontend and provides API endpoints
- **Scraper Service**: Background data collection and synchronization
- **Database Layer**: PostgreSQL for persistent storage

## Technical Analysis

### Banner System Integration

Some of the features and architecture of Ellucian's Banner system are not clear.
The following features, JSON, and more require validation & analysis:

- Struct Nullability
  - Many of the responses provided by Ellucian contain nulls, and it is often unclear when and why they are null.
  - Analysis must be conducted to be sure of when to use a string and when it should be nullable (pointer).
- Multiple Professors / Primary Indicator
- Multiple Meeting Times
- Meeting Schedule Types
  - AFF vs AIN vs AHB etc.
- Do CRNs repeat between years?
- Check whether partOfTerm is always filled in, and its meaning for various class results.
- Check which API calls are affected by change in term/sessionID term select
- SessionIDs
  - How long does a session ID work?
  - Do I really require a separate one per term?
  - How many can I activate, are there any restrictions?
  - How should session IDs be checked as 'invalid'?
  - What action(s) keep a session ID 'active', if any?
- Are there any courses with multiple meeting times?
- Google Calendar link generation, as an alternative to ICS file generation

## Change Identification

- Important attributes of a class will be parsed on both the old and new data.
- These attributes will be compared and given identifiers that can be subscribed to.
- When a user subscribes to one of these identifiers, any changes identified will be sent to the user.
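
The comparison step described in the list above could be sketched roughly as follows. This is only an illustration: the `Snapshot` struct, its fields, and the `identify_changes` function are hypothetical names, and treating `meta` as "the title changed" is an assumption. The identifier strings mirror the `meta`, `time`, `location`, `seats`, and `professor` categories listed in the feature wishlist.

```rust
/// Hypothetical snapshot of the attributes worth comparing between two scrapes.
#[derive(Debug, Clone, PartialEq)]
pub struct Snapshot {
    pub title: String,
    pub instructor: Option<String>,
    pub location: Option<String>,
    pub start_minute: u16, // minutes after midnight
    pub end_minute: u16,   // minutes after midnight
    pub days: String,      // e.g. "MWF"
    pub seats_available: i32,
    pub wait_count: i32,
}

/// Categories of change a user could subscribe to.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ChangeKind {
    Meta,
    Time,
    Location,
    Seats,
    Professor,
}

impl ChangeKind {
    /// Stable string identifier for a subscription.
    pub fn identifier(self) -> &'static str {
        match self {
            ChangeKind::Meta => "meta",
            ChangeKind::Time => "time",
            ChangeKind::Location => "location",
            ChangeKind::Seats => "seats",
            ChangeKind::Professor => "professor",
        }
    }
}

/// Compare the old and new snapshot and list every category that changed.
pub fn identify_changes(old: &Snapshot, new: &Snapshot) -> Vec<ChangeKind> {
    let mut changes = Vec::new();
    // Assumption: "meta" covers the title only.
    if old.title != new.title {
        changes.push(ChangeKind::Meta);
    }
    if (old.start_minute, old.end_minute, &old.days)
        != (new.start_minute, new.end_minute, &new.days)
    {
        changes.push(ChangeKind::Time);
    }
    if old.location != new.location {
        changes.push(ChangeKind::Location);
    }
    if (old.seats_available, old.wait_count) != (new.seats_available, new.wait_count) {
        changes.push(ChangeKind::Seats);
    }
    if old.instructor != new.instructor {
        changes.push(ChangeKind::Professor);
    }
    changes
}
```

A notifier would then match each returned `ChangeKind::identifier()` against the identifiers a user subscribed to.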

## Real-time Suggestions

Various command arguments have the ability to have suggestions appear.

- They must be fast. As ephemeral suggestions that are only relevant for seconds or less, they need to be delivered in less than a second.
- They need to be easy to acquire. With as many commands & arguments to search as I have, it is paramount that the API be easy to understand & use.
- It cannot be complicated. I only have so much time to develop this.
- It does not need to be persistent. Since the data is scraped and rolled periodically from the Banner system, the data used will be deleted and re-requested occasionally.

For these reasons, I believe PostgreSQL to be the ideal place for this data to be stored.
It is exceptionally fast, works well in-memory, and is less complicated compared to most other solutions.

- Only required data about the class will be stored, along with the JSON-encoded string.
  - For now, this would only be the CRN (and possibly the Term).
  - Potentially, a binary encoding could be used for performance, but it is unlikely to be better.
- Database dumping into R2 would be good to ensure that over-scraping of the Banner system does not occur.
  - Upon a requested safe close
    - Must be done quickly (<8 seconds)
  - Every 30 minutes, if any scraping occurred.
    - May cause locking of commands.

## Scraping System

In order to keep the in-memory database of the bot up-to-date with the Banner system, the API must be scraped.
Scraping will be separated by major to allow for priority majors (namely, Computer Science) to be scraped more often compared to others.
This will lower the overall load on the Banner system while ensuring that data presented by the app is still relevant.

For now, all majors will be scraped fully every 4 hours with at least 5 minutes between each one.

- On startup, priority majors will be scraped first (if required).
- Other majors will be scraped in arbitrary order (if required).
- Scrape timing will be stored in the database.
- CRNs will be the Primary Key within the database
  - If CRNs are duplicated between terms, then the primary key will be (CRN, Term)

Considerations

- Change in metadata should decrease the interval
- The number of courses scraped should change the interval (2 hours per 500 courses involved)
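
As a rough illustration of the startup ordering described above, the sketch below plans which majors to scrape and when. All names (`Major`, `plan_scrapes`, the constants) are hypothetical. It assumes "if required" means "never scraped, or last scraped at least 4 hours ago", and that the 5-minute gap applies between consecutive majors. The interval adjustments listed under Considerations are not modeled.

```rust
use std::time::Duration;

/// Every major is fully re-scraped once this much time has passed.
pub const FULL_SCRAPE_INTERVAL: Duration = Duration::from_secs(4 * 60 * 60);
/// Minimum spacing between two consecutive major scrapes.
pub const MIN_GAP: Duration = Duration::from_secs(5 * 60);

/// Hypothetical record of a major and its stored scrape timing.
#[derive(Debug, Clone, PartialEq)]
pub struct Major {
    pub code: String,
    pub priority: bool,
    /// Time since the last scrape; `None` if the major was never scraped.
    pub since_last_scrape: Option<Duration>,
}

/// Returns `(major code, delay from now)` pairs for every major that is due.
/// Priority majors come first; the rest keep their input order.
pub fn plan_scrapes(majors: &[Major]) -> Vec<(String, Duration)> {
    let mut due: Vec<&Major> = majors
        .iter()
        .filter(|m| m.since_last_scrape.map_or(true, |age| age >= FULL_SCRAPE_INTERVAL))
        .collect();
    // Stable sort on a bool key: `false` (priority majors) sorts before `true`.
    due.sort_by_key(|m| !m.priority);
    due.iter()
        .enumerate()
        .map(|(i, m)| (m.code.clone(), MIN_GAP * i as u32))
        .collect()
}
```

Keeping the plan as plain data (code plus delay) makes it easy to persist alongside the scrape timing or to feed into whatever queue the scraper uses.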

## Rate Limiting, Costs & Bursting

Ideally, this application would implement dynamic rate limiting to ensure overload on the server does not occur.
Better, it would also ensure that priority requests (commands) are dispatched faster than background processes (scraping), while making sure different requests are weighted differently.
For example, a recent scrape of 350 classes should be weighted 5x more than a search for 8 classes by a user.
Still, even if the cap does not normally allow for this request to be processed immediately, the small user search should proceed with a small bursting cap.

The requirements for this hypothetical system would be:

- Conditional Bursting: background processes or other requests deemed "low priority" are not allowed to use bursting.
- Arbitrary Costs: rate limiting is considered in the form of the request size/speed more or less, such that small simple requests can be made more frequently, unlike large requests.
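
One way to picture these two requirements is a token bucket with per-request costs and a burst reserve that only commands may dip into. The sketch below is an assumption about how such a limiter might look, not the project's implementation. `CostLimiter`, `Priority`, and all parameters are hypothetical, and time is passed in explicitly as seconds so the behaviour stays deterministic.

```rust
/// Who is asking: background work (scraping) or a user-facing command.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Priority {
    Background,
    Command,
}

/// Token bucket with arbitrary request costs and a command-only burst reserve.
#[derive(Debug)]
pub struct CostLimiter {
    capacity: f64,       // normal budget, in cost units
    burst: f64,          // extra budget only commands may borrow
    refill_per_sec: f64, // cost units restored per second
    available: f64,      // ranges from -burst up to capacity
    last_refill_secs: f64,
}

impl CostLimiter {
    pub fn new(capacity: f64, burst: f64, refill_per_sec: f64) -> Self {
        Self { capacity, burst, refill_per_sec, available: capacity, last_refill_secs: 0.0 }
    }

    /// Try to spend `cost` units at time `now_secs`; returns whether it was allowed.
    pub fn try_acquire(&mut self, cost: f64, priority: Priority, now_secs: f64) -> bool {
        // Refill based on elapsed time, never exceeding the normal capacity.
        let elapsed = (now_secs - self.last_refill_secs).max(0.0);
        self.available = (self.available + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill_secs = self.last_refill_secs.max(now_secs);

        // Conditional bursting: only commands may push the balance below zero.
        let floor = match priority {
            Priority::Command => -self.burst,
            Priority::Background => 0.0,
        };
        if self.available - cost >= floor {
            self.available -= cost;
            true
        } else {
            false
        }
    }
}
```

With a scrape costed at 5 units and a small search at 1, a drained bucket rejects further scrapes while still letting a couple of searches through, which matches the "5x" weighting example above.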
@@ -1,11 +1,17 @@
# Sessions
# Banner

All notes on the internal workings of the Banner system by Ellucian.

## Sessions

All notes on the internal workings of Sessions in the Banner system.

- Sessions are generated on demand with a random string of characters.
  - The format is `{5 random characters}{milliseconds since epoch}`
  - Example: ``
- Sessions are invalidated after 30 minutes, though this interval may change.
  - This delay can be found in the original HTML returned, find `meta[name="maxInactiveInterval"]` and read the `content` attribute.
  - This is read at runtime by the JavaScript on initialization.
  - This is read at runtime (in the browser, by JavaScript) on initialization.
- Multiple timers exist; one of them is the Inactivity Timer.
  - A dialog will appear asking the user to continue their session.
  - If they click the button, the session will be extended via the keepAliveURL (see `meta[name="keepAliveURL"]`).
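
A minimal sketch of the session ID format noted above, using only the standard library. The function names are hypothetical, and the character set (lowercase letters and digits) is an assumption, since the notes only say "5 random characters". The randomness comes from `RandomState`, which is convenient but not cryptographically strong. The 30-minute default should be replaced by the value read from `maxInactiveInterval` when it is available.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Assumed alphabet for the random prefix (not confirmed by the notes).
const ALPHABET: &[u8] = b"abcdefghijklmnopqrstuvwxyz0123456789";

/// Default inactivity limit from the notes; the server may report a different one.
pub const DEFAULT_MAX_INACTIVE: Duration = Duration::from_secs(30 * 60);

/// Joins a prefix and a millisecond timestamp in the documented order.
pub fn format_session_id(prefix: &str, millis: u128) -> String {
    format!("{}{}", prefix, millis)
}

/// Builds a new ID: 5 random characters followed by milliseconds since the epoch.
pub fn new_session_id() -> String {
    // Each `RandomState` is seeded with fresh random keys, so hashing nothing
    // still yields an unpredictable-enough u64 for this purpose.
    let mut bits = RandomState::new().build_hasher().finish();
    let base = ALPHABET.len() as u64;
    let prefix: String = (0..5)
        .map(|_| {
            let ch = ALPHABET[(bits % base) as usize] as char;
            bits /= base;
            ch
        })
        .collect();
    let millis = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_millis())
        .unwrap_or(0);
    format_session_id(&prefix, millis)
}

/// A session is treated as stale once it has been idle for the full limit.
pub fn is_stale(idle: Duration, max_inactive: Duration) -> bool {
    idle >= max_inactive
}
```

Tracking the idle time per session and checking `is_stale` before each request would let a client refresh or replace a session before Banner invalidates it.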
58
docs/FEATURES.md
Normal file
@@ -0,0 +1,58 @@
# Features

## Current Features

### Discord Bot Commands

- **search** - Search for courses with various filters (title, course code, keywords)
- **terms** - List available terms or search for a specific term
- **time** - Get meeting times for a specific course (CRN)
- **ics** - Generate an ICS calendar file for a course with holiday exclusions
- **gcal** - Generate a Google Calendar link for a course

### Data Pipeline

- Intelligent scraping system with priority queues
- Rate limiting and burst handling
- Background data synchronization

## Feature Wishlist

### Commands

- ICS Download (get an ICS download of your classes with location & timing perfectly set for every class you're in)
- Classes Now (find classes happening)
- Autocomplete
  - Class Title
  - Course Number
  - Term/Part of Term
  - Professor
  - Attribute
- Component Pagination
- RateMyProfessor Integration (Linked/Embedded)
- Smart term selection (i.e. Summer 2024 will be selected automatically when opened)
- Rate Limiting (bursting with global/user limits)
- DMs Integration (allow usage of the bot in DMs)
- Class Change Notifications (get notified when details about a class change)
- Multi-term Querying (currently the backend for searching is kinda weird)
- Full Autocomplete for Every Search Option
- Metrics, Log Query, Privileged Error Feedback
- Search for Classes
  - Major, Professor, Location, Name, Time of Day
- Subscribe to Classes
  - Availability (seat, pre-seat)
  - Waitlist Movement
  - Detail Changes (meta, time, location, seats, professor)
    - `time` Start, End, Days of Week
    - `seats` Any change in seat/waitlist data
    - `meta`
- Lookup via Course Reference Number (CRN)
- Smart Time of Day Handling
  - "2 PM" -> Start within 2:00 PM to 2:59 PM
  - "2-3 PM" -> Start within 2:00 PM to 3:59 PM
  - "ends by 2 PM" -> Ends within 12:00 AM to 2:00 PM
  - "after 2 PM" -> Start within 2:01 PM to 11:59 PM
  - "before 2 PM" -> Ends within 12:00 AM to 1:59 PM
- Get By Section Command
  - CS 4393 001 =>
  - Will require SQL to be able to search for a class by its section number
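
The "Smart Time of Day Handling" rules in the wishlist above map each phrase to an inclusive window on either the start or the end time of a class. A minimal sketch of that mapping follows. The `TimeQuery`, `Bound`, and `resolve` names are hypothetical, hours are assumed to be given in 24-hour form (2 PM is `14`), and parsing the user's text into a `TimeQuery` is left out.

```rust
/// A parsed time-of-day filter; hours are 0-23.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TimeQuery {
    At(u8),          // "2 PM"
    Between(u8, u8), // "2-3 PM"
    EndsBy(u8),      // "ends by 2 PM"
    After(u8),       // "after 2 PM"
    Before(u8),      // "before 2 PM"
}

/// Which side of the meeting time the window applies to.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Bound {
    Start,
    End,
}

const LAST_MINUTE: u16 = 23 * 60 + 59; // 11:59 PM

/// Returns `(bound, first minute, last minute)`, inclusive, in minutes after midnight.
pub fn resolve(query: TimeQuery) -> (Bound, u16, u16) {
    let minutes = |hour: u8| u16::from(hour) * 60;
    match query {
        TimeQuery::At(h) => (Bound::Start, minutes(h), minutes(h) + 59),
        TimeQuery::Between(a, b) => (Bound::Start, minutes(a), minutes(b) + 59),
        TimeQuery::EndsBy(h) => (Bound::End, 0, minutes(h)),
        TimeQuery::After(h) => (Bound::Start, minutes(h) + 1, LAST_MINUTE),
        // saturating_sub keeps "before 12 AM" from underflowing.
        TimeQuery::Before(h) => (Bound::End, 0, minutes(h).saturating_sub(1)),
    }
}
```

Because every variant resolves to the same `(bound, first, last)` shape, a search could turn it into a single range condition on the stored start or end minute.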
42
docs/README.md
Normal file
@@ -0,0 +1,42 @@
# Documentation

This folder contains detailed documentation for the Banner project. This file acts as the index.

## Files

- [`FEATURES.md`](FEATURES.md) - Current features, implemented functionality, and future roadmap
- [`BANNER.md`](BANNER.md) - General API documentation on the Banner system
- [`ARCHITECTURE.md`](ARCHITECTURE.md) - Technical implementation details, system design, and analysis

## Samples

The `samples/` folder contains real Banner API response examples:

- `search/` - Course search API responses with various filters
  - [`searchResults.json`](samples/search/searchResults.json)
  - [`searchResults_500.json`](samples/search/searchResults_500.json)
  - [`searchResults_CS500.json`](samples/search/searchResults_CS500.json)
  - [`searchResults_malware.json`](samples/search/searchResults_malware.json)
- `meta/` - Metadata API responses (terms, subjects, instructors, etc.)
  - [`get_attribute.json`](samples/meta/get_attribute.json)
  - [`get_campus.json`](samples/meta/get_campus.json)
  - [`get_instructionalMethod.json`](samples/meta/get_instructionalMethod.json)
  - [`get_instructor.json`](samples/meta/get_instructor.json)
  - [`get_partOfTerm.json`](samples/meta/get_partOfTerm.json)
  - [`get_subject.json`](samples/meta/get_subject.json)
  - [`getTerms.json`](samples/meta/getTerms.json)
- `course/` - Course detail API responses (HTML and JSON)
  - [`getFacultyMeetingTimes.json`](samples/course/getFacultyMeetingTimes.json)
  - [`getClassDetails.html`](samples/course/getClassDetails.html)
  - [`getCorequisites.html`](samples/course/getCorequisites.html)
  - [`getCourseDescription.html`](samples/course/getCourseDescription.html)
  - [`getEnrollmentInfo.html`](samples/course/getEnrollmentInfo.html)
  - [`getFees.html`](samples/course/getFees.html)
  - [`getLinkedSections.html`](samples/course/getLinkedSections.html)
  - [`getRestrictions.html`](samples/course/getRestrictions.html)
  - [`getSectionAttributes.html`](samples/course/getSectionAttributes.html)
  - [`getSectionBookstoreDetails.html`](samples/course/getSectionBookstoreDetails.html)
  - [`getSectionPrerequisites.html`](samples/course/getSectionPrerequisites.html)
  - [`getXlistSections.html`](samples/course/getXlistSections.html)

These samples are used for development, testing, and understanding the Banner API structure.