Backend Developer · 2020
#Python #Django #PostgreSQL #Redis #Celery #React #Docker

Digital Tutor

Analytics platform for university management with tools for assessing student learning outcomes and recommendations for optimizing educational processes

About the Project

Digital Tutor is an analytics platform for the management of Ural Federal University (UrFU), designed for monitoring and analyzing the educational process. The system collects data on student test and exam results, analyzes learning patterns, and provides university leadership with tools for making informed decisions to improve the educational process.

The project addressed the lack of a centralized analytics system for the educational process: university management found it difficult to assess the effectiveness of academic programs, compare results between faculties, and track learning dynamics year over year.

My Role: Backend developer in a team of three groups — backend, ML team (analytics and recommendations), and design/frontend. I was responsible for server-side development, integration with the ML analytics service, and work with student data and test results.

Key Features

Data Collection and Aggregation

Centralized system for collecting learning outcomes:

  • Import of test and exam results from various sources
  • Data normalization from different faculties and departments
  • Storage of historical data for year-over-year trend analysis
  • API for integration with existing university systems

Learning Analytics

Tools for assessing educational process effectiveness:

  • Visualization of learning statistics by courses and topics
  • Comparison of results between faculties and groups
  • Year-over-year trend analysis
  • Identification of problematic topics with low performance
  • Comparison of different teaching methodologies by results
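The "problematic topics" detection above boils down to aggregating scores per topic and flagging the ones below a cutoff. A minimal sketch of that logic, with an illustrative data shape and a hypothetical 60-point threshold (neither is from the real project):

```python
# Sketch of identifying "problematic" topics from raw scores.
# The (topic, score) row shape and the 60-point threshold are
# illustrative assumptions, not values from the real project.
from collections import defaultdict

def problematic_topics(results, threshold=60.0):
    """Return topics whose average score falls below the threshold.

    `results` is an iterable of (topic, score) pairs, e.g. rows
    aggregated from imported test results.
    """
    totals = defaultdict(lambda: [0.0, 0])  # topic -> [sum, count]
    for topic, score in results:
        totals[topic][0] += score
        totals[topic][1] += 1
    return sorted(
        topic for topic, (s, n) in totals.items() if s / n < threshold
    )

rows = [("SQL", 72), ("SQL", 68), ("Graphs", 41), ("Graphs", 55)]
print(problematic_topics(rows))  # → ['Graphs']
```

In production this aggregation would run in SQL over the results table; the Python version just makes the flagging rule explicit.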

ML Recommendations for Management

Machine learning-based recommendation system:

  • Automatic detection of underperformance patterns
  • Recommendations for adjusting academic program difficulty
  • Suggestions for redistributing study hours between topics
  • Forecasting problematic areas in future semesters
  • Assessment of the impact of curriculum changes on student outcomes

Management Dashboards

Interactive panels for decision-making:

  • Summary dashboards with key university-wide metrics
  • Drill-down by faculty, department, and course
  • Custom report and chart building
  • Data export for presentations and reporting
  • Filter system by periods, student groups, instructors

Architectural Solutions

Python · Django · PostgreSQL · Redis · Celery · React · Docker · Nginx

Backend

Django-based architecture:

  • Import and normalization of data from various university sources
  • Storage of historical student performance data
  • Aggregation and calculation of statistical metrics
  • Integration with external ML service for analytics and recommendations

Technology stack:

  • Django REST Framework for API
  • PostgreSQL for storing large volumes of historical data
  • Redis for caching heavy analytical queries
  • Celery for asynchronous processing (data import, report generation)
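The Celery-plus-Redis combination above can be sketched as a report pre-calculation step. In the real project the function would be registered as a Celery task (e.g. with `@shared_task`) and the cache would be Redis; here it is a plain function with an in-memory dict so the flow is runnable as-is, and all names are illustrative:

```python
# Sketch of asynchronous report pre-calculation. A Celery beat schedule
# would call this per faculty after a nightly import; dashboards then
# read the cached report instead of recomputing aggregates per request.
# All names are illustrative; REPORT_CACHE stands in for Redis.
import json

REPORT_CACHE = {}

def build_faculty_report(faculty, grades):
    """Aggregate grades for one faculty and cache the serialized report."""
    avg = sum(grades) / len(grades) if grades else 0.0
    report = {"faculty": faculty, "students": len(grades),
              "average": round(avg, 2)}
    REPORT_CACHE[f"report:{faculty}"] = json.dumps(report)
    return report

print(build_faculty_report("IRIT", [4, 5, 3, 5]))
# → {'faculty': 'IRIT', 'students': 4, 'average': 4.25}
```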

Frontend

React application with responsive interface (developed by separate team):

  • Control panels for university management
  • Interactive dashboards with charts and diagrams
  • Customizable reports with filters by faculties, years, courses
  • Visualization of trend dynamics and comparative analysis

ML Service Integration

Working with analytics engine:

  • Integration with external ML service for performance pattern analysis
  • API for requesting recommendations on curriculum optimization
  • Caching analytics results for fast access to frequent queries
  • Processing and formatting recommendations for dashboard display
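The caching of ML-service responses described above follows a standard read-through TTL pattern. A minimal sketch, using an in-process dict in place of Redis so it runs standalone (function and key names are assumptions):

```python
# Read-through TTL cache for ML-service responses. In production the
# dict would be Redis with an expiry set per key; the fetch callable
# would be an HTTP call to the external ML analytics service.
import time

_cache = {}  # key -> (expires_at, value)

def cached_recommendations(course_id, fetch, ttl=300.0):
    """Return cached recommendations or fetch them from the ML service."""
    key = f"ml:recs:{course_id}"
    entry = _cache.get(key)
    now = time.monotonic()
    if entry and entry[0] > now:
        return entry[1]              # cache hit: skip the slow call
    value = fetch(course_id)         # cache miss: call the ML service
    _cache[key] = (now + ttl, value)
    return value

calls = []
def fake_fetch(course_id):
    calls.append(course_id)
    return ["reduce difficulty of module 3"]

cached_recommendations(42, fake_fetch)
cached_recommendations(42, fake_fetch)  # served from cache
print(len(calls))  # → 1
```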

Technical Challenges and Solutions

Processing Large Data Volumes

Problem: The system needed to store and analyze results for 10,000+ students over several years of study while still generating reports quickly.

Solution:

  • Denormalization of frequently queried aggregates to speed up queries
  • PostgreSQL table partitioning by academic years
  • Building indexes for typical analytical queries (grouping by faculties, courses, periods)
  • Asynchronous pre-calculation of popular reports via Celery
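The partitioning by academic year could be declared as in the sketch below, applied from a Django migration via `migrations.RunSQL`. The table and column names and the year boundaries are illustrative; the real schema is not shown in this write-up:

```python
# Illustrative DDL for range-partitioning results by academic year.
# In a Django migration this string would be passed to
# migrations.RunSQL(...), so queries filtered by year touch only
# one partition. Table/column names are hypothetical.
CREATE_PARTITIONED_RESULTS = """
CREATE TABLE exam_results (
    id          bigserial,
    student_id  bigint  NOT NULL,
    course_id   bigint  NOT NULL,
    score       numeric NOT NULL,
    taken_at    date    NOT NULL
) PARTITION BY RANGE (taken_at);

CREATE TABLE exam_results_2019 PARTITION OF exam_results
    FOR VALUES FROM ('2019-09-01') TO ('2020-09-01');
CREATE TABLE exam_results_2020 PARTITION OF exam_results
    FOR VALUES FROM ('2020-09-01') TO ('2021-09-01');
"""
```

Partition pruning is also what makes archiving cheap: dropping a whole academic year is a single `DROP TABLE` on one partition.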

Integrating Heterogeneous Data Sources

Problem: Student performance data came from various faculty systems in different formats (Excel, CSV, APIs of various LMS).

Solution:

  • Development of a set of adapters for importing data from different sources
  • Data validation and normalization before saving to database
  • Import error logging system for manual verification of problematic records
  • API for semi-automated import with preview and confirmation
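The adapter approach above can be sketched as a small class per source format, each normalizing rows into one canonical shape and collecting per-row errors for manual review. The interface and the CSV column names are hypothetical:

```python
# Adapter pattern for heterogeneous import sources: each adapter turns
# one source format into normalized dicts plus a list of row errors
# that would be logged for manual verification. Names are illustrative.
import csv
import io

class ImportAdapter:
    """Base interface: parse raw source data into normalized records."""
    def parse(self, raw):
        raise NotImplementedError

class CsvAdapter(ImportAdapter):
    def parse(self, raw):
        rows, errors = [], []
        # start=2: row 1 of the file is the header line
        for lineno, row in enumerate(csv.DictReader(io.StringIO(raw)), start=2):
            try:
                rows.append({
                    "student_id": int(row["Student ID"]),
                    "score": float(row["Score"]),
                })
            except (KeyError, ValueError) as exc:
                errors.append((lineno, str(exc)))  # kept for manual review
        return rows, errors

raw = "Student ID,Score\n101,87\n102,not-a-number\n"
rows, errors = CsvAdapter().parse(raw)
print(len(rows), len(errors))  # → 1 1
```

Adding a new faculty source then means adding one adapter subclass, which matches the "without changing core logic" point in the lessons-learned section.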

Analytical Query Performance

Problem: Complex analytical queries with grouping and aggregation over millions of records led to timeouts.

Solution:

  • Materialized views for heavy aggregations
  • Aggressive caching in Redis with TTL based on data update frequency
  • Background cache rebuild after importing new data
  • Query optimization using EXPLAIN ANALYZE to identify bottlenecks
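A materialized view for one of the heavy aggregations could look like the sketch below, again as SQL strings that a migration and a post-import Celery task could execute. The view definition and names are illustrative:

```python
# Illustrative materialized view for a heavy per-course aggregation,
# plus the refresh statement a post-import background task would run.
# CONCURRENTLY avoids locking readers but requires a unique index on
# the view. All names are hypothetical.
CREATE_COURSE_STATS = """
CREATE MATERIALIZED VIEW course_stats AS
SELECT course_id,
       count(*)   AS attempts,
       avg(score) AS avg_score
FROM exam_results
GROUP BY course_id;
"""

REFRESH_COURSE_STATS = "REFRESH MATERIALIZED VIEW CONCURRENTLY course_stats;"
```

Dashboards then query `course_stats` directly, so the million-row scan happens once per import instead of once per page load.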

Results

  • 10,000+ students in database
  • 5+ faculties
  • 50+ academic courses
  • 3+ years of history

Impact on University Management

  • Transparency: Management received a unified system for assessing educational process effectiveness instead of scattered faculty reports
  • Data-Driven Decisions: ML recommendations helped identify problematic courses and topics requiring curriculum revision
  • Comparative Analysis: Ability to compare results between faculties and years became the foundation for best practices
  • Resource Optimization: System helped determine where more study hours or teaching resources were needed

Technical Metrics

  • Data Volume: Over 10 million records of test and exam results
  • Performance: Building complex reports with multi-year aggregation took less than 5 seconds thanks to caching
  • Reliability: 99.5% uptime throughout the academic year
  • Import Speed: Processing data for 1000+ students in a single import took less than 10 minutes

Data Security

  • Authentication: JWT tokens with refresh mechanism for management staff
  • Authorization: RBAC (Role-Based Access Control) with granular access rights to data from different faculties
  • Encryption: TLS for all connections, protection of student personal data in database
  • Compliance: Personal data protection in accordance with Russian legislation requirements
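The faculty-scoped RBAC described above reduces to checking a role's faculty set before serving data. A minimal sketch; the role names and the `"*"` wildcard convention are illustrative assumptions, not the project's actual permission model:

```python
# Sketch of faculty-scoped access control: each role maps to the set of
# faculties it may view, with "*" as university-wide access. Role and
# faculty names are hypothetical.
ROLE_FACULTIES = {
    "rector": {"*"},          # university-wide access
    "dean_irit": {"IRIT"},    # a single faculty
}

def can_view(role, faculty):
    """Return True if the role may view the given faculty's data."""
    allowed = ROLE_FACULTIES.get(role, set())
    return "*" in allowed or faculty in allowed

print(can_view("dean_irit", "IRIT"))   # → True
print(can_view("dean_irit", "INMT"))   # → False
```

In the Django implementation this kind of check would typically live in a DRF permission class and in queryset filtering, so a dean's drill-down never returns other faculties' rows.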

Lessons Learned

What Worked Well

Materialized Views and Caching: The combination of materialized views in PostgreSQL and Redis caching achieved high performance even for complex analytical queries over millions of records.

Table Partitioning: Dividing data by academic years significantly accelerated queries with time filtering and simplified archiving of old data.

Data Import Adapters: Modular adapter system allowed quickly adding support for new data sources without changing core logic.

Challenges

Integrating Heterogeneous Data: Each faculty had its own performance tracking system with a different structure. We had to create a flexible field-mapping and data-validation system with detailed logging for troubleshooting.

Performance Optimization: Initial versions of the analytical queries were very slow. Making them fast required a deep study of EXPLAIN ANALYZE output, building proper indexes, and moving to denormalized structures for aggregates.

Working with Personal Data: The need to protect student personal data required special attention to security. Implemented encryption of sensitive fields, data access logging, and granular permission system.

Project Outcome

The project was successfully launched at Ural Federal University in 2020 and was actively used by management to analyze the educational process. The system processed data from over 10,000 students across 5+ faculties in 50+ academic courses, providing management with tools for making informed decisions about optimizing educational programs.

Unfortunately, the project was closed after the pilot period due to a change in university priorities. Nevertheless, the experience of working with large data volumes, building analytics systems, and optimizing PostgreSQL performance proved extremely valuable for my further career in developing high-load backend systems.