This project is an interactive GUI application built with Tkinter for loading, viewing, sorting, and plotting CSV data. It also includes predictive modeling features using LSTM, ARIMA, and linear regression models to forecast future data trends.
- Load CSV Data: Easily load CSV files into the application.
- View Data: Display the loaded data in a new window.
- Sort Data: Sort the data based on a selected column.
- Plot Data: Visualize the data by plotting selected columns.
- Predictive Modeling: Forecast future data trends using LSTM, ARIMA, and linear regression models.
- Google Scholar Scraping: Retrieve publication data from Google Scholar using SerpAPI.
- Clone the repository:
git clone https://github.com/your-username/data-analysis-app.git
- Navigate to the project directory:
cd data-analysis-app
- Install the required dependencies:
pip install pandas matplotlib tensorflow scikit-learn statsmodels serpapi
- Run the main application script:
python main.py
- Use the GUI to load, view, sort, and plot data.
- Run predictive modeling scripts to forecast future data trends:
python lstm_forecasting.py
python arima_forecasting.py
python regresja_liniowa.py
- Retrieve publication data from Google Scholar using the scraping script:
python scrapping.py
- DataApp: Main class for the application, handling UI creation, event binding, and core functionality.
- load_data(): Function to load CSV data into the application.
- view_data(): Function to display the loaded data in a new window.
- sort_data(): Function to sort the data based on a selected column.
- plot_data(): Function to plot the data based on a selected column.
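The load and sort steps listed above can be sketched without the GUI. This is a minimal illustration, assuming the application relies on pandas for CSV handling. The standalone function signatures, the `ascending` parameter, and the sample columns are hypothetical and are not taken from main.py.

```python
import io

import pandas as pd


def load_data(source):
    """Read CSV data from a path or file-like object into a DataFrame."""
    return pd.read_csv(source)


def sort_data(df, column, ascending=True):
    """Return a sorted copy of df; raise KeyError if the column is unknown."""
    if column not in df.columns:
        raise KeyError(f"Unknown column: {column}")
    return df.sort_values(by=column, ascending=ascending).reset_index(drop=True)


# Hypothetical sample data standing in for a user-selected CSV file.
sample = io.StringIO("year,publications\n2021,7\n2019,3\n2020,5\n")
df = sort_data(load_data(sample), "year")
print(df)
```

Keeping the data logic in plain functions like these, separate from the Tkinter widgets, would make it possible to exercise sorting without opening a window.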
- LSTM Model (lstm_forecasting.py):
  - Data preparation, model training, and future predictions using LSTM for both publication count and citation count.
- ARIMA Model (arima_forecasting.py):
  - Data preparation, model fitting, and future predictions using ARIMA for both publication count and citation count.
- Linear Regression Model (regresja_liniowa.py):
  - Data preparation, model training, and future predictions using linear regression for both publication count and citation count.
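To illustrate the simplest of the three approaches, the sketch below fits a straight line to yearly counts and extrapolates it forward. It uses a closed-form least-squares fit in plain Python rather than scikit-learn, so it is not the code in regresja_liniowa.py. The function names and the sample numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(xs, ys, future_xs):
    """Fit a line to (xs, ys) and evaluate it at each value in future_xs."""
    slope, intercept = fit_line(xs, ys)
    return [slope * x + intercept for x in future_xs]


# Hypothetical yearly publication counts, not real project data.
years = [2019, 2020, 2021, 2022]
publications = [3, 5, 7, 9]
predicted = forecast(years, publications, [2023, 2024])
print(predicted)
```

The same pattern (prepare a series, fit, predict future steps) would apply to citation counts by swapping in a different list of values.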
- pobierz_dane_scholarly(nazwa_autora): Function to retrieve publication data from Google Scholar for a given author using the SerpAPI.
- main(): Main function to read a list of authors, retrieve their publication data, and save it to a CSV file.
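The workflow described for main() (iterate over authors, fetch records, write a CSV) can be sketched as follows. The real script calls SerpAPI, which requires an API key, so this example substitutes a hypothetical `fake_fetch` stub for pobierz_dane_scholarly. The helper names and the `author`, `year`, and `citations` columns are assumptions, not the script's actual schema.

```python
import csv
import io


def collect_publications(authors, fetch):
    """Call fetch(author) for each author and tag every record with the name."""
    rows = []
    for author in authors:
        for record in fetch(author):
            rows.append({"author": author, **record})
    return rows


def write_csv(rows, handle, fieldnames):
    """Write the rows to an open text handle with a header line."""
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


def fake_fetch(author):
    """Hypothetical stand-in for the SerpAPI lookup; returns fixed data."""
    return [{"year": 2020, "citations": 4}]


rows = collect_publications(["A. Author", "B. Author"], fake_fetch)
buffer = io.StringIO()  # an in-memory file; a real run would open a .csv path
write_csv(rows, buffer, ["author", "year", "citations"])
print(buffer.getvalue())
```

Passing the fetch function as an argument keeps the CSV logic usable offline; the real lookup could be supplied in place of the stub.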