This course is offered by the University of the West of England Undergraduate Double Degree Program in Business Administration at Chung Hua University.
Business data analysis is an analytical field that integrates various statistical methods with (big) data. The field has evolved from traditional data analysis to big data analysis, eventually branching into big data analytics within computer science. This shift marks a transition from focusing on analytical methods to focusing on analytical tools and technologies.
However, for management professionals, the core of data analysis lies in the methods and results—specifically, the interpretation of those results to inform decision-making. Therefore, a "Business data analysis" course designed for management is distinct from one in an IT department. It focuses less on technical infrastructure and more on data analysis and the interpretation of findings.
As big data technology merges with Artificial Intelligence (AI), we see a divergence in results. While AI has achieved significant success in classifying labeled data (such as text, images, and video), its performance in structured data analysis is still evolving. For most businesses, numerical data is often more critical and objective than non-numerical data. Consequently, how to effectively introduce AI into numerical data analysis is a vital topic for both academia and industry.
Most numerical data accumulated by enterprises is time-series data. The time factor within this data contains valuable insights for decision-making. However, determining how to incorporate the influence of time into AI computations and judgment remains a challenge.
The Foundation: Regression Analysis
The most common method for analyzing time-series data is **regression analysis**. Traditional methods include linear regression (simple and multiple regression).
Premise: The principle of Least Squares, which selects the line that minimizes the sum of squared errors so that the fit to the observed data is as accurate as possible.
Assumptions:
The conditional expectation follows a linear model.
Normal distribution of errors.
Homoscedasticity (constant variance).
Independence of observations.
The Mathematical Model: Y = E(Y|X) + ε, where E(Y|X) = β0 + β1 X
Goal: Find the estimated values of β0 and β1.
Method: Following the premise, Ordinary Least Squares (OLS) writes the sum of squared errors as a function of β0 and β1, differentiates it with respect to each coefficient, and sets the derivatives to zero to obtain the estimates. Alternatively, Maximum Likelihood Estimation (MLE) under the normal distribution assumption yields the same coefficient estimates. In machine learning, the sum of squared errors is referred to as a Loss Function, and the goal is to minimize the loss.
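The OLS step above can be sketched in a few lines of Python. This is a minimal illustration for simple regression only, using the closed-form solution that results from setting the derivatives of the sum of squared errors to zero. The function names `ols_fit` and `sse` and the sample numbers are hypothetical and chosen for illustration; they are not taken from the course software.

```python
def ols_fit(x, y):
    """Return (b0, b1) that minimize sum((y_i - b0 - b1 * x_i) ** 2)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired observations")
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b1 = sxy / sxx              # slope estimate
    b0 = y_mean - b1 * x_mean   # intercept estimate
    return b0, b1


def sse(x, y, b0, b1):
    """Sum of squared errors (the loss) for the line b0 + b1 * x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Made-up example: time period as X, an invented ratio as Y.
periods = [1, 2, 3, 4, 5]
ratio = [2.0, 4.1, 5.9, 8.2, 9.8]
b0, b1 = ols_fit(periods, ratio)
loss = sse(periods, ratio, b0, b1)
```

Any other choice of intercept or slope gives a larger value of `sse` on the same data, which is what "minimizing the loss" means here.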
Hidden Assumptions and the Role of AI
In traditional analysis, there are "hidden" premises that often become default "hidden rules" in AI automation rather than being derived from the data itself. Usually, the analyst manually decides:
How many data points to include.
Whether to import all data at once to generate a single regression line.
In practice, analysts decide which data range is "appropriate" based on their specific problem. While the resulting regression line satisfies the principle of minimum error, the data scope is still determined by human judgment.
If the data could "decide for itself" how many points should form a regression line, what would it choose? Would the data’s own selection be better, more objective, and more effective at revealing errors than human presets?
If we replace the analyst with AI, letting the AI represent the data to judge the optimal number of points for a regression line, the result might not be a single line. Instead, it could be multiple regression lines. This is because the AI will evaluate the data based strictly on the core premise: which model is truly the most accurate, that is, which has the minimal error?
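One way to picture "letting the data decide" is a brute-force search over split points, keeping the split whose two regression lines have the smallest combined squared error. The sketch below is only an illustration under stated assumptions: the source does not describe how the course software makes this choice, and the names `fit_line`, `best_split`, and `min_points`, as well as the data, are hypothetical.

```python
def fit_line(x, y):
    """OLS fit for one segment; returns (intercept, slope, sse)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    err = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, err


def best_split(x, y, min_points=3):
    """Try every split index k and return (total_sse, k) for the best
    two-line fit, where the first line uses x[:k] and the second x[k:].
    Each segment must contain at least min_points observations."""
    best = None
    for k in range(min_points, len(x) - min_points + 1):
        total = fit_line(x[:k], y[:k])[2] + fit_line(x[k:], y[k:])[2]
        if best is None or total < best[0]:
            best = (total, k)
    return best


# Made-up series that rises for four periods and then falls.
t = [1, 2, 3, 4, 5, 6, 7, 8]
values = [1, 2, 3, 4, 4, 3, 2, 1]

single_sse = fit_line(t, values)[2]        # one line over all points
split_sse, split_at = best_split(t, values)  # two lines, best split
```

On this series a single line fits poorly, while two lines split after the fourth point fit much better. The `min_points` constraint matters: without some limit, adding more segments never increases the squared error, so a pure minimum-error rule would keep splitting until every segment is trivially small.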
The "AI Mathematical Modeling" software is an AI-powered educational tool specifically designed to analyze numerical data and identify mathematical patterns. By integrating this tool into the curriculum, users can leverage real-world data and computational results in the classroom to understand the underlying patterns and characteristics of corporate data.
This approach is designed to address the challenges businesses face with the time factor inherent in numerical time-series data. It also aims to provide more robust and actionable results than traditional statistical regression analysis.
Foundational Knowledge: Develop a fundamental understanding of data analysis, big data analysis, and Artificial Intelligence (AI).
Trend Identification: Learn to apply statistical regression analysis to numerical data, identifying regression lines (also known as trendlines) where time serves as the independent variable.
Learning by Doing: Align with the core spirit of vocational training—"Learning by Doing." This course transforms traditional teaching—moving away from textbooks, chalkboards, and paper exams—toward practical operations that directly enhance professional competencies for the workplace.
Weekly Data Updates: Generate new trend charts through weekly data refreshes to establish long-term tracking habits.
Synthesis and Induction: By aggregating historical trend charts, students will observe and summarize patterns of change.
Skill Acquisition: Gain practical skills and knowledge in big data modeling, trend analysis, and data visualization through hands-on experience.
To achieve short-term trend visualization, students will share their generated data trend charts on social media platforms. These efforts will be aggregated to co-create a "Course Results Website" for long-term tracking. During this process, students will not only master software operations but also learn to identify trendlines within corporate financial ratios, aiding their understanding of financial performance shifts.
Each week, students will update the data by adding three new financial ratio indicators to generate the latest trend charts. This allows them to accumulate the results necessary for long-term tracking and helps them understand how to manage datasets with multiple variables.
Students will learn the critical step of data format transformation, a necessary requirement for ensuring data compatibility within AI automation workflows.
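As an illustration of what a format transformation can look like, the sketch below reshapes a "wide" table (one column per financial ratio) into "long" records (one row per period and indicator). The source does not specify the input format the software expects, so the layout, the field names `period`, `indicator`, and `value`, and the numbers are all assumptions made for this example.

```python
def wide_to_long(rows, id_field="period"):
    """Convert rows like {"period": 1, "current_ratio": "1.8", ...}
    into one record per (period, indicator) pair with a float value."""
    long_rows = []
    for row in rows:
        for name, value in row.items():
            if name == id_field:
                continue  # the identifier column is not an indicator
            long_rows.append({
                id_field: row[id_field],
                "indicator": name,
                "value": float(value),  # text from a spreadsheet becomes a number
            })
    return long_rows


# Made-up input, as it might look after reading a spreadsheet export.
wide = [
    {"period": 1, "current_ratio": "1.8", "debt_ratio": "0.45"},
    {"period": 2, "current_ratio": "1.9", "debt_ratio": "0.43"},
]
long_records = wide_to_long(wide)
```

The long layout makes it straightforward to add new indicators each week without changing the table's columns.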
Students will gain insight into how software embedded with AI processes data. The educational software displays every computational step and the resulting visualizations, making the internal logic of AI transparent.
Every stage of the software’s computation generates data results that must be saved. Students will learn the systematic process of preserving these results within a database for future use.
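A minimal sketch of preserving computed results in a database, using Python's built-in `sqlite3` module. The table name `trend_results`, its columns, the helper `save_fit`, and the sample values are hypothetical; the source does not say which database or schema the course uses.

```python
import sqlite3


def save_fit(conn, company, indicator, week, intercept, slope, sse):
    """Store one fitted trendline so it can be compared in later weeks."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS trend_results (
               company TEXT, indicator TEXT, week INTEGER,
               intercept REAL, slope REAL, sse REAL)"""
    )
    conn.execute(
        "INSERT INTO trend_results VALUES (?, ?, ?, ?, ?, ?)",
        (company, indicator, week, intercept, slope, sse),
    )
    conn.commit()


# In-memory database for demonstration; a file path would persist results.
conn = sqlite3.connect(":memory:")
save_fit(conn, "ExampleCo", "current_ratio", 1, 0.09, 1.97, 0.05)
save_fit(conn, "ExampleCo", "current_ratio", 2, 0.12, 1.91, 0.07)

# Retrieve the stored history for one indicator, oldest week first.
history = conn.execute(
    "SELECT week, slope FROM trend_results "
    "WHERE company = ? AND indicator = ? ORDER BY week",
    ("ExampleCo", "current_ratio"),
).fetchall()
```

Storing one row per week and indicator keeps the full history available, which is what long-term tracking of trend changes requires.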
Through the dedicated website, students can visualize and present the short-term trend fluctuations of various companies' financial ratios as part of a comprehensive, long-term tracking project.