The total score is 100 points. A submission must score at least 80 points to be considered for a cash prize. The highest score that meets this minimum wins first place.
Data Collection and Quality [30]
Source Reliability: Data is sourced from reliable repositories (e.g., GitHub).
Completeness: All required data fields (cities/areas, languages, projects, tags, frameworks, databases, CI/CD) are fully populated.
Accuracy: Data is correct, with no duplicates or errors.
Exclusion Criteria: Projects with fewer than 2 contributors, no commits in the last year, or fewer than 3 stars are excluded.
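As an illustration only, the exclusion criteria and the no-duplicates requirement could be applied along the following lines. The record fields (`full_name`, `contributors`, `stars`, `last_commit`) are hypothetical names chosen for this sketch, and "the last year" is assumed to mean the 365 days before a reference date; a submission may reasonably define either differently.

```python
from datetime import date, timedelta

# Thresholds taken from the exclusion criteria above.
MIN_CONTRIBUTORS = 2
MIN_STARS = 3
# Assumption: "the last year" means the 365 days before the reference date.
MAX_COMMIT_AGE = timedelta(days=365)


def is_eligible(project, today):
    """Return True if the project passes all three exclusion criteria."""
    has_enough_contributors = project["contributors"] >= MIN_CONTRIBUTORS
    has_enough_stars = project["stars"] >= MIN_STARS
    is_recently_active = today - project["last_commit"] <= MAX_COMMIT_AGE
    return has_enough_contributors and has_enough_stars and is_recently_active


def filter_projects(projects, today=None):
    """Keep eligible projects, skipping repeated repository names."""
    today = today or date.today()
    seen = set()
    kept = []
    for project in projects:
        name = project["full_name"]  # hypothetical unique key per repository
        if name in seen:
            continue  # duplicate record, already handled
        seen.add(name)
        if is_eligible(project, today):
            kept.append(project)
    return kept
```

Passing an explicit `today` keeps the filter reproducible: rerunning the notebook later with the same reference date yields the same set of projects.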
Visualization Quality [20]
Graphs and Visuals: All data is represented through clear and informative graphs, meeting the deliverables requirement.
Visual Clarity: Visualizations are easy to understand, with appropriate labels, legends, and scales.
Notebook Organization [20]
Structured Notebooks: Each category of statistics has its own separate, well-structured notebook, as per the deliverables.
Code Quality: Code in the notebooks is clean, well-commented, and reproducible.
Tooling and Automation [20]
Use of Python and Tools: The notebooks effectively use Python and relevant libraries for data analysis and visualization.
Automation of Data Collection: Data collection and processing are automated where possible to ensure repeatability and reduce manual errors.
Reporting and Documentation [10]
Documentation: Notebooks are well-documented, with explanations of methodologies, data sources, and any assumptions made.
Transparency: All processes are transparent, with clear steps for data collection, analysis, and visualization.
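For reference, the scoring arithmetic described above can be sketched as follows. The function names are hypothetical, and treating an unscored category as zero points is an assumption of this sketch rather than a stated rule.

```python
# Maximum points per category, as listed in the bracketed weights above.
WEIGHTS = {
    "Data Collection and Quality": 30,
    "Visualization Quality": 20,
    "Notebook Organization": 20,
    "Tooling and Automation": 20,
    "Reporting and Documentation": 10,
}
MIN_PRIZE_SCORE = 80  # minimum total required to be considered for a cash prize


def total_score(awarded):
    """Sum the awarded points, checking each against its category maximum."""
    for category, points in awarded.items():
        if category not in WEIGHTS:
            raise ValueError(f"unknown category: {category}")
        if not 0 <= points <= WEIGHTS[category]:
            raise ValueError(f"{category}: {points} is outside 0..{WEIGHTS[category]}")
    # Assumption: categories missing from `awarded` contribute zero points.
    return sum(awarded.values())


def is_prize_eligible(awarded):
    """Return True if the total meets the minimum required score."""
    return total_score(awarded) >= MIN_PRIZE_SCORE
```

For example, a submission awarded 25, 18, 15, 16, and 8 points across the five categories totals 82 and is eligible, while one awarded 20, 15, 15, 15, and 8 totals 73 and is not.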