In this challenge, you take on the role of Chief Data Scientist for your city's school district. Using Python's Pandas library, you will analyze district-wide standardized test results to identify trends in school performance. The data includes math and reading scores, school information, and other relevant factors. This analysis will assist the school board in making strategic decisions about budgets and priorities.
- Total number of unique schools.
- Total number of students.
- Total school district budget.
- Average math and reading scores.
- Percentage of students passing math and reading.
- Percentage of students passing both subjects.
- School type and total student count.
- Per capita spending.
- Average test scores.
- Schools with passing rates of 70% or higher in math and reading.
- Overall passing percentages for each school.
- Top-Performing Schools: Ranked by overall passing percentage.
- Lowest-Performing Schools: Ranked by overall passing percentage.
- Math and Reading Scores by Grade: Comparison of test scores by grade level.
- Scores by School Spending: Analysis of school performance based on spending levels.
- Scores by School Size: Comparison of school performance based on school size.
- Scores by School Type: Analysis of performance based on charter and district school types.
- Schools with lower budgets performed better in both math and reading, showing higher passing rates.
- Smaller and medium-sized schools consistently outperformed larger schools.
- Charter schools outperformed District schools, with all top-performing schools being Charter and all lowest-performing schools being District.
- Academic performance was generally consistent across grade levels.
- Students performed better in reading than in math, and overall passing rates were low, pointing to a performance gap between the two subjects.
- Python: Scripting language for data analysis.
- Pandas: Library used for data manipulation and aggregation.
- Jupyter Notebook: Interactive development environment for running Python scripts.
- Clone this repository.
- Ensure Python and Pandas are installed in your environment.
- Run the analysis scripts in Jupyter Notebook or a Python environment to view results.
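As a rough idea of what such a script computes, the district-level and per-school metrics listed earlier could be derived along the lines below. This is a minimal sketch, not the repository's actual code: the sample rows, the column names (`school_name`, `type`, `budget`, `math_score`, `reading_score`), and the passing score of 70 are all assumptions made for illustration.

```python
import pandas as pd

# Made-up sample rows for illustration only; the real data files and
# column names in the repository may differ.
schools = pd.DataFrame({
    "school_name": ["Alpha High", "Beta High"],
    "type": ["Charter", "District"],
    "budget": [1200, 3000],
})
students = pd.DataFrame({
    "school_name": ["Alpha High", "Alpha High",
                    "Beta High", "Beta High", "Beta High"],
    "math_score": [80, 65, 90, 60, 72],
    "reading_score": [85, 75, 68, 95, 70],
})

PASSING_SCORE = 70  # assumption: a score of 70 or higher counts as passing

# One row per student, with the school's type and budget attached.
df = students.merge(schools, on="school_name", how="left")
df["pass_math"] = df["math_score"] >= PASSING_SCORE
df["pass_reading"] = df["reading_score"] >= PASSING_SCORE
df["pass_both"] = df["pass_math"] & df["pass_reading"]

# District summary: the mean of a boolean column is the share of True values.
district_summary = pd.Series({
    "total_schools": df["school_name"].nunique(),
    "total_students": len(df),
    "total_budget": schools["budget"].sum(),
    "avg_math": df["math_score"].mean(),
    "avg_reading": df["reading_score"].mean(),
    "pct_pass_math": df["pass_math"].mean() * 100,
    "pct_pass_reading": df["pass_reading"].mean() * 100,
    "pct_pass_both": df["pass_both"].mean() * 100,
})

# Per-school summary: one row per school.
per_school = df.groupby("school_name").agg(
    school_type=("type", "first"),
    total_students=("math_score", "size"),
    budget=("budget", "first"),
    avg_math=("math_score", "mean"),
    avg_reading=("reading_score", "mean"),
    pct_pass_math=("pass_math", "mean"),
    pct_pass_reading=("pass_reading", "mean"),
    pct_pass_both=("pass_both", "mean"),
)
pct_cols = ["pct_pass_math", "pct_pass_reading", "pct_pass_both"]
per_school[pct_cols] = per_school[pct_cols] * 100
per_school["per_student_budget"] = (
    per_school["budget"] / per_school["total_students"]
)

# Rank schools by overall passing percentage (highest first, lowest first).
top_schools = per_school.sort_values("pct_pass_both", ascending=False).head(5)
bottom_schools = per_school.sort_values("pct_pass_both").head(5)

print(district_summary)
print(per_school)
```

Merging first keeps every later step a plain `groupby` on a single student-level table, at the cost of repeating each school's budget on every student row.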
The attached written report summarizes the analysis and draws conclusions from the key calculations. Refer to it in the repository for a more detailed breakdown of the data.
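The comparisons by spending, size, and school type come down to grouping schools into bins and averaging within each bin. The sketch below shows one way to do that with `pd.cut`. The per-school table, its column names, the bin edges, and the bin labels are all hypothetical and are not results from the actual analysis.

```python
import pandas as pd

# Hypothetical per-school summary; the values are invented for illustration.
per_school = pd.DataFrame({
    "school_name": ["A", "B", "C", "D"],
    "school_type": ["Charter", "Charter", "District", "District"],
    "total_students": [400, 1500, 2900, 4000],
    "per_student_budget": [580.0, 610.0, 640.0, 655.0],
    "avg_math": [83.0, 82.0, 77.0, 76.0],
    "pct_pass_both": [90.0, 89.0, 55.0, 53.0],
}).set_index("school_name")

metrics = ["avg_math", "pct_pass_both"]

# Assumed bin edges; each bin includes its right edge (pd.cut default).
per_school["spending_range"] = pd.cut(
    per_school["per_student_budget"],
    bins=[0, 600, 650, float("inf")],
    labels=["<=600", "601-650", ">650"],
)
per_school["size_range"] = pd.cut(
    per_school["total_students"],
    bins=[0, 1000, 2000, 5000],
    labels=["Small", "Medium", "Large"],
)

# Average the school-level metrics within each bin.
by_spending = per_school.groupby("spending_range", observed=True)[metrics].mean()
by_size = per_school.groupby("size_range", observed=True)[metrics].mean()
by_type = per_school.groupby("school_type")[metrics].mean()

print(by_spending)
print(by_size)
print(by_type)
```

Note that this averages school-level averages, so every school counts equally regardless of enrollment. Weighting by student count would require grouping the student-level data instead.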
- Sakina Jaffri - Data analysis, Pandas scripting, and reporting.