npsm 새물리 New Physics : Sae Mulli

pISSN 0374-4914 eISSN 2289-0041

Article

Research Paper

New Phys.: Sae Mulli 2024; 74: 678-687

Published online July 31, 2024 https://doi.org/10.3938/NPSM.74.678

Copyright © New Physics: Sae Mulli.

Physics Education and Symbolic Regression

Eunhye Shin1, Jinseop Jang2, Junghyo Jo2*

1Bucheon Technical High School, Bucheon 14733, Korea
2Department of Physics Education, Seoul National University, Seoul 08826, Korea

Correspondence to:*jojunghyo@snu.ac.kr

Received: April 2, 2024; Revised: May 29, 2024; Accepted: June 3, 2024

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License(http://creativecommons.org/licenses/by-nc/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

This study explores the use of symbolic regression (SR) in physics education, aiming to gauge its effectiveness and educational value. SR involves deriving mathematical models from empirical data by finding symbolic representations that fit the data. We evaluate two SR algorithms, AI-Feynman and Φ-SO, using position data from objects in parabolic motion and damped oscillations. Our analysis demonstrates that SR algorithms can produce concise formulas to describe object motion. Integrating SR into physics education allows students to build on their prior knowledge of physics to formulate hypothetical symbolic terms and enhance their explanations of physical phenomena. Subsequently, students iteratively derive mathematical expressions from data, thereby nurturing a process of data-driven discovery. Furthermore, students can recognize the impact of technological advancements on scientific problem-solving. However, effective pedagogical strategies are necessary to guide students beyond mere derivation of mathematical expressions, encouraging them to interpret and elucidate models in meaningful scientific inquiries.

Keywords: Physics education, Symbolic regression, Machine learning, AI-Feynman, Φ-SO

Physics describes natural phenomena by constructing mathematical models. Extensive literature supports the view that mathematical modeling is integral to the field of physics. Hestenes notes that a physics model is inherently a mathematical model, where physical properties are represented by quantitative variables[1]. Deriving mathematical expressions that encapsulate observations or experimental measurements attempts to explain nature[2] and reason about underlying phenomena[3]. Some studies argue that a central challenge in physics is finding a symbolic expression that provides a simple yet accurate fit to a given data set[4], thereby elucidating the relationship between empirical observations and the system being studied[5].

Physics education researchers have highlighted the crucial role of students' effective use of mathematics and their comprehension of the physical meaning behind mathematical expressions in solving physics problems[6]. Sherin notes that students develop a conceptual understanding of a physical situation and then express this understanding through equations. Additionally, they can interpret an equation as a specific representation of a physical system[7]. Similarly, Hestenes emphasizes that the underlying models within physical principles and equations should be clearly articulated to students, as mathematical modeling is vital for grasping and explaining the link between scientific theories and empirical phenomena[1]. Brahmia et al. propose that mastering symbolic forms should be a learning objective in introductory physics courses, rather than merely reflecting the typical thinking patterns of physics students[8]. Mashood et al. argue that solving complex physics problems requires students to employ imagination to create systematic connections between real-world and mathematical structures, and to comprehend high-level modeling that integrates these relationships into coherent mathematical models[9].

Koza[5] coined the term ‘symbolic regression (SR)’ to describe the task of discovering a function in symbolic form that accurately fits a given finite dataset. An early example of this process is Kepler’s discovery of Mars' elliptical orbit after about 40 failed attempts over four years, which marked a scientific revolution[4]. Today, we can combine symbolic regression with machine learning techniques to extract mathematical equations directly from data. While traditional regression methods fit parameters to predetermined equation forms, SR employs genetic programming[10], a technique inspired by Darwinian evolution, to explore both parameter values and equation structures simultaneously[5]. Although genetic programming-based methods achieve high prediction accuracy, they do not scale well to high-dimensional data sets and are sensitive to hyperparameters[11]. Recently, deep learning-based methods have also been applied to SR[2]. The search for a parsimonious and elegant form of the unknown equation[12] allows scientists to gain deeper insights into the underlying phenomena[13]. SR has been successfully applied in various fields, including particle kinematics[14], astrophysics[15], and chemistry[16].

Current physics teaching primarily follows a model-driven approach, where theories are learned deductively based on a priori knowledge. However, real-world science often employs a data-driven approach, where rules are discovered inductively from observational data. With the increasing ease of collecting large amounts of real-time data through various sensors and physical computing, it is becoming more feasible to use this data-driven approach in the classroom. Given the growing interest in integrating artificial intelligence into science education, we suggest incorporating SR into physics education, considering its success in the natural sciences. In particular, the model selection process through SR can be highly effective in finding rules hidden in observed phenomena. A data-driven approach to physics education using SR is expected to complement traditional model-centered physics education.

As students engage in the generation, evaluation, and modification of scientific models, they develop a more profound comprehension of scientific principles and problem-solving abilities[17]. In a physics class using SR, when students build a model to explain an unfamiliar phenomenon, they derive hypotheses from observations and data about the phenomenon. The coherence of the student's hypothesis with the phenomenon can be verified by the size of the error in the derived equation. As students revise their hypotheses, they update the code, resulting in a revised model and error. Students take the initiative to iteratively modify their hypotheses and code to reduce error and improve the fit between the model and the phenomenon. This process is similar to the philosophy that scientific knowledge is advanced by ‘epistemic iteration,’ a process of building and refining knowledge by reexamining and revisiting premises at each step, rather than simply maintaining and repeating initially held beliefs[18].

For SR, as used in physics research, to give students an indirect experience of how scientific knowledge progresses through iterative model elaboration, it is essential first to study whether SR is feasible in an educational context. This should be followed by a discussion of the implications of introducing SR in physics education. We focus on two algorithms, AI-Feynman[4] and Φ-SO[2], which are notable for their foundations in physics principles. AI-Feynman is known for its capability to identify the 100 equations presented in Feynman's physics lectures[4], while Φ-SO is a physical symbolic optimization method that recovers analytic functions from physical data[2]. Previous studies have demonstrated that both packages perform successfully on computer-generated data. However, to investigate the feasibility of SR in physics education, it is necessary to examine whether SR also works on noisy data collected by students in the laboratory. Therefore, we apply both SR packages to real data collected in the laboratory, which distinguishes our work from previous studies[2, 4].

1. Data collection

1) Parabolic motion

To investigate the suitability of the SR algorithms for experimental classes, we used them to derive a mathematical model for the peak height of parabolic motion. The input for this model was the launch angle of the object. The peak height can be obtained from the equations for velocity and position in the vertical direction, given the launch angle. However, due to the involvement of trigonometric functions, experimental measurement has limitations. The parabolic motion data were collected using a parabolic launcher and the Tracker video analysis software, as shown in Fig. 1. The experiment was repeated multiple times for each angle to account for potential errors caused by slight position changes resulting from the launch impact. The average of the peak values, excluding outliers, was used as the peak value for each angle. The data shown in Fig. 2 are the result of this process. We predicted the maximum height using the theoretical equation with the mean initial velocity of 2.06 m/s and the acceleration due to gravity of 9.807 m/s$^2$. The root mean square error (RMSE) between the predicted and observed values was then calculated to be $1.560\times10^{-3}$.
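The comparison between theory and measurement described above can be sketched in a few lines. This is a minimal illustration, assuming the standard drag-free result $H = v_0^2\sin^2\theta/(2g)$ and the values of $v_0$ and $g$ quoted in the text. The angle and height lists are hypothetical placeholders, not the measured data, and the function names are illustrative.

```python
import math

G = 9.807   # m/s^2, value quoted in the text
V0 = 2.060  # m/s, mean initial speed quoted in the text


def peak_height(theta_deg, v0=V0, g=G):
    """Maximum height of a drag-free projectile launched at theta_deg degrees."""
    theta = math.radians(theta_deg)
    return (v0 * math.sin(theta)) ** 2 / (2.0 * g)


def rmse(predicted, observed):
    """Root mean square error between two equally long sequences."""
    n = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


# Hypothetical measurements (angle in degrees, peak in metres); NOT the paper's data.
angles = [30, 45, 60]
observed = [0.055, 0.107, 0.163]

predicted = [peak_height(a) for a in angles]
print("RMSE:", rmse(predicted, observed))
```

Replacing the placeholder lists with the averaged Tracker measurements would give the kind of RMSE reported in the text.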

Figure 1. (Color online) Parabolic launcher and Tracker.

Figure 2. (Color online) Ideal and experimental peak.

2) Damped oscillation

This study aimed to derive a mathematical model of damped oscillations by SR. To collect experimental data, a spring pendulum was constructed by hanging a weight on a spring, as shown in Fig. 3, and oscillated in a graduated cylinder filled with water. During the motion of the pendulum, we were careful not to let the spring touch the water, because contact with the water would affect the motion of the pendulum. To analyze the video, we set the point of connection between the pendulum and the ring as the tracking point in Tracker. We collected over 500 position data points along the X and Y axes of the damped oscillator and used 50 of them for our analysis. The data shown in Fig. 4 are the result of this process. The RMSE between the collected data and the curve fitted with the general solution of damped harmonic oscillation was $2.064\times10^{-3}$. The fitted curve was obtained using ‘scipy.optimize.curve_fit’ in Python.
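As an illustration of the model behind the fitted curve, the sketch below evaluates the general solution $Ae^{-\alpha t}\cos(\omega t+\varphi)+B$ and reduces 500 samples to 50. The parameter values, the frame rate, and the evenly spaced selection rule are assumptions made for this example; the text does not state how the 50 points were chosen, and the fit itself (done with ‘scipy.optimize.curve_fit’ in the study) is not reproduced here.

```python
import math


def damped_position(t, A, alpha, omega, phi, B):
    """General solution A*exp(-alpha*t)*cos(omega*t + phi) + B."""
    return A * math.exp(-alpha * t) * math.cos(omega * t + phi) + B


def subsample(values, n_keep):
    """Keep n_keep evenly spaced items, always starting with the first one."""
    step = len(values) / n_keep
    return [values[int(i * step)] for i in range(n_keep)]


# Hypothetical parameters and a 500-frame time axis at 100 frames per second.
times = [i / 100 for i in range(500)]
positions = [damped_position(t, 0.05, 0.15, 8.8, 0.0, 0.0) for t in times]

t50 = subsample(times, 50)
y50 = subsample(positions, 50)
print(len(t50), len(y50))
```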

Figure 3. (Color online) Damped oscillator and Tracker.

Figure 4. (Color online) Experimental data and fitted curve.

2. Method

1) AI-Feynman

AI-Feynman is an SR algorithm known for its physics-inspired approach. According to Udrescu et al.[4], AI-Feynman's core principles involve discovering simple patterns in data and recursively finding exact equations. The authors argue that functions in physics and many other scientific applications often exhibit simplifying properties that facilitate their discovery: ① Units: the function and its variables have known physical units; ② Low-order polynomial: the function (or part of it) is a low-degree polynomial; ③ Compositionality: the function is composed of a small set of elementary functions, each with no more than two arguments; ④ Smoothness: the function is continuous and possibly analytic in its domain; ⑤ Symmetry: the function shows translational, rotational, or scaling symmetry with respect to some of its variables; ⑥ Separability: the function can be expressed as a sum or product of parts with distinct variables.

AI-Feynman discovers equations through a sequential process. First, it conducts dimensional analysis on the given data to simplify the problem and reduce the variables to a smaller number of dimensionless ones. Second, it uses polynomial fitting, which amounts to solving a system of linear equations, to determine the best-fit polynomial coefficients. Third, in the brute-force step, all possible combinations of mathematical symbols, functions, and variables are tested to generate an equation that fits the given data set. Since there are $s^n$ strings of length $n$ over an alphabet of $s$ symbols, a restricted subset of symbols is used during the brute-force step. After the brute-force step terminates, either when the maximum fitting error drops below a threshold or when a time limit has been exceeded, neural networks are trained to identify simplifying properties such as symmetry, separability, and compositionality. If any such properties are found, the problem is recursively transformed into simpler ones with fewer variables. The software package was obtained from the GitHub repository published by Udrescu et al.[4], available at https://github.com/SJ001/AI-Feynman.
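To make the $s^n$ growth of the brute-force step concrete, the following toy search enumerates every token string up to a given length and returns the first one that reproduces the data. It is a sketch only and not AI-Feynman's implementation: the six-symbol alphabet, the reverse Polish encoding, and all function names are choices made for this example.

```python
import itertools
import math

# Toy alphabet: variable x, constant 1, two unary and two binary operations.
UNARY = {"s": math.sin, "q": lambda a: a * a}           # sin, square
BINARY = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}
ALPHABET = ["x", "1"] + list(UNARY) + list(BINARY)      # s = 6 symbols


def eval_rpn(tokens, x):
    """Evaluate a reverse Polish token string; return None if it is malformed."""
    stack = []
    for tok in tokens:
        if tok == "x":
            stack.append(x)
        elif tok == "1":
            stack.append(1.0)
        elif tok in UNARY:
            if not stack:
                return None
            stack.append(UNARY[tok](stack.pop()))
        else:
            if len(stack) < 2:
                return None
            b, a = stack.pop(), stack.pop()
            stack.append(BINARY[tok](a, b))
    return stack[0] if len(stack) == 1 else None


def brute_force(xs, ys, max_len=3, tol=1e-9):
    """Return the first (shortest) token string that matches the data within tol."""
    for n in range(1, max_len + 1):
        # There are len(ALPHABET)**n candidate strings of length n.
        for tokens in itertools.product(ALPHABET, repeat=n):
            preds = [eval_rpn(tokens, x) for x in xs]
            if any(p is None for p in preds):
                continue
            if max(abs(p - y) for p, y in zip(preds, ys)) < tol:
                return "".join(tokens)
    return None


xs = [0.1 * k for k in range(1, 10)]
ys = [math.sin(x) ** 2 for x in xs]
print(brute_force(xs, ys))  # prints "xsq", i.e. square(sin(x))
```

Even this tiny alphabet yields 216 candidates at length 3, which is why restricting the symbol set shortens the search so sharply.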

2) Φ-SO

Petersen et al.[11] introduced a reinforcement learning-based deep SR framework, which has become the new standard for exact symbolic function recovery, particularly in the presence of noise. Building on this, Tenachi et al.[2] presented the Physical Symbolic Optimization (Φ-SO) framework, which recovers analytical symbolic expressions from physics data using deep reinforcement learning while learning units constraints.

Tenachi et al.[2] regard symbolic expressions as binary trees in which each node represents a symbol from the library of available symbols, i.e., an input variable (e.g., $x$, $t$), a constant (e.g., $v_0$), or an operation (e.g., $+$, $-$, $\times$, $/$, $\sin$, $\log$). Using prefix notation and treating the symbols, referred to as tokens, as categories allows any expression to be treated as a sequence of categorical vectors. Token sequences are generated by a Recurrent Neural Network (RNN), which, in essence, is a neural network that can be invoked multiple times to create a logical chain of similar operations. By supplying the physical units of the tokens, together with the units the next token must have in order to respect the units rules, the neural network is made to take into account not only the local structure of the expression but also the local units constraints when generating the next token. In this way, dimensional analysis is built into the symbolic arrangement of the mathematical expressions. AI-Feynman, by contrast, addresses SR problems by first transforming the variables to make them dimensionless. If the dimensionless formulation fails, AI-Feynman reverts to the original problem setup, which results in high-order polynomials or complicated expressions. Φ-SO is instead designed to yield only physically plausible expressions by construction.
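The tree and units ideas above can be illustrated with a short sketch that parses a prefix token sequence into a nested tree and propagates units through it. This is not Φ-SO's code: the token names, the units table, and the after-the-fact check are assumptions made for this example, whereas Φ-SO applies the units constraints while the sequence is being generated.

```python
# Units as exponent tuples in the order (length, time, mass).
UNITS = {"v0": (1, -1, 0), "g": (1, -2, 0), "theta": (0, 0, 0)}
ARITY = {"mul": 2, "div": 2, "add": 2, "sin": 1, "sq": 1}


def parse_prefix(tokens):
    """Consume a prefix token list and return a nested tuple (expression tree)."""
    tok = tokens.pop(0)
    if tok in ARITY:
        children = [parse_prefix(tokens) for _ in range(ARITY[tok])]
        return (tok, *children)
    return tok


def units_of(tree):
    """Propagate units through the tree; raise ValueError on a units violation."""
    if isinstance(tree, str):
        return UNITS[tree]
    op, *args = tree
    u = [units_of(a) for a in args]
    if op == "mul":
        return tuple(a + b for a, b in zip(u[0], u[1]))
    if op == "div":
        return tuple(a - b for a, b in zip(u[0], u[1]))
    if op == "sq":
        return tuple(2 * a for a in u[0])
    if op == "add":
        if u[0] != u[1]:
            raise ValueError("cannot add quantities with different units")
        return u[0]
    if op == "sin":
        if any(u[0]):
            raise ValueError("sin needs a dimensionless argument")
        return (0, 0, 0)
    raise ValueError(f"unknown operation: {op}")


# v0^2 * sin(theta)^2 / g written in prefix order.
tokens = ["div", "mul", "sq", "v0", "sq", "sin", "theta", "g"]
tree = parse_prefix(list(tokens))
print(units_of(tree))  # (1, 0, 0), i.e. a length
```

An expression such as `["sin", "v0"]` raises a `ValueError` here, which is the kind of candidate a units-aware generator never proposes in the first place.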

Φ-SO uses a neural network to generate a categorical distribution over symbols and optimizes its parameters according to fit quality and physical units constraints. The network that generates the distribution of symbols is trained with a reinforcement learning strategy. In this approach, a set of trial symbolic functions is generated, and a scalar reward for each function is computed by comparing it with the data. Following Petersen et al.[11], a risk-seeking policy is adopted, which aims to maximize the reward of the few best-performing candidates rather than the average reward. This enables efficient exploration of the search space at the expense of average performance. The software package was obtained from the GitHub repository published by Tenachi et al.[2], available at https://github.com/WassimTenachi/PhySO.
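The risk-seeking idea can be sketched as follows: score each candidate, then keep only the best few for the policy update. The reward form $1/(1+\mathrm{NRMSE})$, the 5% default fraction, and the function names are assumptions made for this illustration; the actual values and update rule are set by the package configuration.

```python
import math


def reward(predicted, observed):
    """Reward in (0, 1]; 1 means an exact fit. The 1/(1+NRMSE) form is assumed here."""
    n = len(observed)
    mean = sum(observed) / n
    std = math.sqrt(sum((o - mean) ** 2 for o in observed) / n)  # must be non-zero
    rmse = math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)
    return 1.0 / (1.0 + rmse / std)


def risk_seeking_batch(candidates, rewards, keep_fraction=0.05):
    """Keep only the top keep_fraction of candidates, ranked by reward."""
    n_keep = max(1, int(len(candidates) * keep_fraction))
    ranked = sorted(zip(rewards, candidates), key=lambda rc: rc[0], reverse=True)
    return [c for _, c in ranked[:n_keep]]


# Hypothetical batch of four candidate expressions with made-up rewards.
names = ["a", "b", "c", "d"]
scores = [0.1, 0.9, 0.5, 0.7]
print(risk_seeking_batch(names, scores, keep_fraction=0.5))  # ['b', 'd']
```

Training on the top candidates only means that a batch with one excellent expression and many poor ones is still informative, which suits a search where almost all trial expressions fit badly.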

3. Analysis

The AI-Feynman algorithm can be executed in a Python environment using the ‘run_aifeynman’ module in the AI-Feynman package. The main arguments, in the order shown in Fig. 5, are ‘pathdir’, ‘filename’, ‘BF_try_time’, ‘BF_ops_file_type’, ‘polyfit_deg’, ‘NN_epochs’, and ‘vars_name’. Here, ‘pathdir’ is the path to the directory containing the data file; ‘filename’ is the name of the data file; ‘BF_try_time’ is the time limit for each brute-force call; ‘BF_ops_file_type’ is the file listing the symbols to be used in the brute-force search, for which the package provides four default sets; and ‘polyfit_deg’ is the maximum degree of the polynomial tried by the polynomial-fit routine, so an appropriate choice is crucial. Additionally, ‘NN_epochs’ is the number of epochs for training the neural network, and ‘vars_name’ lists the names of the variables appearing in the equation. The order and number of variables must match the data file. Dimensional analysis can be performed when the variables in ‘vars_name’ are specified together with their dimensions. After dimensional analysis, the data used by the algorithm become dimensionless. For example, in the case of parabolic motion, where the dimension array is ordered as [length, time, mass], the gravitational acceleration $g$ is [1, -2, 0], the initial velocity $v$ is [1, -1, 0], and the height $y$ is [1, 0, 0]. To obtain a dimensionless quantity with dimension array [0, 0, 0], polynomial fitting and the brute-force step are conducted with the new target $g\,v^{-2}\,y$.
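The dimensional bookkeeping in the example above is easy to verify by hand or in code. The sketch below adds up the dimension vectors given in the text, weighted by the exponents of each variable; the helper name `combined_dimension` is hypothetical and not part of either package.

```python
# Dimension vectors in the order [length, time, mass], as given in the text.
DIMS = {"g": (1, -2, 0), "v": (1, -1, 0), "y": (1, 0, 0)}


def combined_dimension(exponents):
    """Dimension of a product of variables raised to the given integer exponents."""
    total = [0, 0, 0]
    for name, power in exponents.items():
        for i, d in enumerate(DIMS[name]):
            total[i] += power * d
    return tuple(total)


# Target used after dimensional analysis: g * v**-2 * y
print(combined_dimension({"g": 1, "v": -2, "y": 1}))  # (0, 0, 0)

# The combination v**2 / g carries the dimension of a length.
print(combined_dimension({"v": 2, "g": -1}))          # (1, 0, 0)
```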

Figure 5. (Color online) Example of using AI-Feynman module.

The Φ-SO algorithm can be run using the module ‘physo.SR’ in the Python environment. As shown in Fig. 6, ‘X_array’ is the input data and ‘y_array’ is the target data. The variables ‘X_names’, ‘X_units’, ‘y_name’, and ‘y_units’ specify their names and dimensions. Constants and their dimensions are entered in the ‘free_consts_names’ and ‘free_consts_units’ variables. The dimensions can be set as length, time, mass, or other relevant dimensions, in any order. However, the order of the dimension array must be the same for all variables (e.g., X and y) and constants (e.g., $v_0$ and $g$). The ‘run_config’ includes configuration values for training and reward, and the package offers two parameters, ‘stop_reward’ and ‘epochs’, that determine when the algorithm stops. However, the default value of ‘stop_reward’ is 1 (exact match), which can be problematic when training on real experimental data, where the reward may never reach 1, resulting in longer training times. Therefore, to use the algorithm in experimental classes, it is important to set ‘stop_reward’ to a value less than 1 and to specify ‘epochs’ separately. The ‘epochs’ parameter determines the number of times the algorithm repeats; if not specified, it defaults to the value set in the config file, $10^9$. Additionally, we performed a preprocessing step of min-max normalization, because the Φ-SO algorithm successfully produced the intended mathematical model when the target data was normalized.
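The min-max preprocessing step mentioned above rescales the data to the interval [0, 1]. A minimal sketch follows, with hypothetical function names and placeholder values; it also returns the minimum and maximum so that a prediction can be mapped back to physical units.

```python
def min_max_normalize(values):
    """Rescale values linearly to [0, 1]; also return the min and max used."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; min-max scaling is undefined")
    return [(v - lo) / (hi - lo) for v in values], lo, hi


def denormalize(value, lo, hi):
    """Map a normalized value back to the original scale."""
    return value * (hi - lo) + lo


# Placeholder heights in metres, not the measured data.
heights = [0.02, 0.05, 0.11, 0.16]
scaled, lo, hi = min_max_normalize(heights)
print(scaled[0], scaled[-1])  # 0.0 1.0
```

Because the scaling changes the numerical constants in a regressed formula, coefficients obtained from normalized data need to be interpreted with `lo` and `hi` in mind.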

Figure 6. (Color online) Example of using Φ-SO module.

1. Parabolic motion

Table 1 shows the SR results as mathematical models for the peak of parabolic motion. Figure 7 is a visualization of the regressed equation and theoretical calculations using the ‘matplotlib.pyplot’ package. The theoretical peak equation of parabolic motion is $v_0^2\sin^2\theta/(2g)$. However, the output of the AI-Feynman algorithm is $0.503\,\sin^2\theta$ and the output of the Φ-SO algorithm is $v_0^2\sin^2\theta/g$. Since the AI-Feynman algorithm derives a model for dimensionless target data, careful interpretation is required. The dimension of the target value, height, is [length, time, mass] = [1, 0, 0]. The initial velocity $v_0$ has the dimension [1, -1, 0], and the gravitational acceleration $g$ has the dimension [1, -2, 0]. Therefore, the combination $v_0^2 g^{-1}$ has the dimension [1, 0, 0], which is the same as the dimension of height. The coefficient 0.503 in the output $0.503\,\sin^2\theta$ can thus be interpreted as the result of calculating $v_0^2 g^{-1}$. The value is slightly different from 0.433 obtained from the experimental data ($v_0=2.060$ m/s and $g=9.807$ m/s$^2$). The RMSE between the observed and predicted values, estimated by the AI-Feynman model, was $7.448\times10^{-3}$. Note that for an effective comparison, we normalized the maximum heights of the observed and predicted values to unity.
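The reference value 0.433 quoted above follows directly from the measured quantities given in the text, as this short worked calculation shows:

```latex
\frac{v_0^{2}}{g}
  = \frac{(2.060\ \mathrm{m/s})^{2}}{9.807\ \mathrm{m/s^{2}}}
  = \frac{4.2436}{9.807}\ \mathrm{m}
  \approx 0.433\ \mathrm{m}
```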

Figure 7. (Color online) Results of symbolic regression for launch angle-peaks.


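Table 1. Results of symbolic regression for parabolic motion.

| Method | Algorithm | Equation | RMSE | Remarks |
|---|---|---|---|---|
| Theory | - | $v_0^2\sin^2\theta/(2g)$ | $1.560\times10^{-3}$ | $g=9.807$ m/s$^2$, $v_0=2.060$ m/s |
| Symbolic Regression | AI-Feynman | $0.503\,\sin^2\theta$ | $7.448\times10^{-3}$ | $v_0^2/g=0.433$ |
| Symbolic Regression | Φ-SO | $v_0^2\sin^2\theta/g$ | $6.801\times10^{-3}$ | Normalized input data |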


The Φ-SO algorithm produces results relatively similar to the theoretical formula. However, the denominator of the algorithm's output model was $g$, while the denominator of the theoretical formula was $2g$. This discrepancy suggests that proper tuning of the algorithm parameters is necessary to achieve a fully consistent formula, and careful interpretation is required depending on the context. The RMSE between the observed and predicted values, estimated by the Φ-SO algorithm, after the normalization was $6.801\times10^{-3}$.

2. Damped oscillation

Table 2 shows the SR results as mathematical models for damped oscillation. Figure 8 is a visualization of the regressed formula and fitted curve on the experimental data using the ‘matplotlib.pyplot’ package. The formula for the curve fitted with the general solution of damped oscillation is $Ae^{-\alpha t}\cos(\omega t+\varphi)+B$. AI-Feynman outputs the SR model as $e^{-x_1}\cos(x_0+x_2)+1.021$, while Φ-SO outputs $e^{-\alpha t}\cos(\omega t)$. The AI-Feynman algorithm requires careful interpretation, since the model is derived for dimensionless target data.

Figure 8. (Color online) Results of symbolic regression for damped oscillation.


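Table 2. Results of symbolic regression for damped oscillation.

| Method | Algorithm | Equation | RMSE | Remarks |
|---|---|---|---|---|
| Curve Fitting (General solution) | - | $Ae^{-\alpha t}\cos(\omega t+\varphi)+B$ | $2.064\times10^{-3}$ | Obtained from ‘scipy.optimize.curve_fit’ module |
| Symbolic Regression | AI-Feynman | $e^{-x_1}\cos(x_0+x_2)+1.021$ | $2.526\times10^{-3}$ | $x_0=\omega t$, $x_1=\alpha t$, $x_2=\varphi$ |
| Symbolic Regression | Φ-SO | $Ae^{-0.017\,\omega t}\cos(\omega t)$ | $2.015\times10^{-2}$ | Normalized input data |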


Since the target value of the experimental data has a dimension of [1, 0, 0], the dimension of the time variable $t$ is entered as [0, 1, 0], the dimensions of the constants $\omega$ and $\alpha$ are entered as [0, -1, 0], and the dimension of $\varphi$ is entered as [0, 0, 0]. We found that after dimensional analysis, the input data was reduced to three dimensionless variables: $x_0=\omega t$, $x_1=\alpha t$, and $x_2=\varphi$. Therefore, the output model of the AI-Feynman algorithm can be reinterpreted as $e^{-\alpha t}\cos(\omega t+\varphi)+1.021$, where 1.021 can be interpreted as $A^{-1}B$, which differs from the value of 0.940 obtained by curve fitting. The RMSE between the observed and predicted values, estimated by the AI-Feynman algorithm, after the normalization was $2.526\times10^{-3}$.

From the experimental data of damped oscillation, the Φ-SO algorithm derived the regression formula in the basic form of the general solution of damped oscillation. The part of the formula in which the amplitude decays exponentially was expressed as $e^{\cos(2\varphi)\,\omega t}$. Here, $\cos(2\varphi)$ is approximately -0.017 for $\varphi=2.348$, which corresponds to the output of the Φ-SO algorithm. Additionally, in the oscillating part of the formula, $\cos(\omega t)$, the value of $\omega$ was 8.804, while the frequency observed in the data was 8.955. The RMSE between the observed and predicted values, estimated by the Φ-SO algorithm, after the normalization was $2.015\times10^{-2}$.

The results of our study indicate that symbolic regression(SR) can effectively derive equations from real data. In our direct application of SR, idealized data generated by parabolic and damped oscillation formulas yielded satisfactory regressions with minimal effort. However, real data collected from experiments contained noise, leading to long learning times and more complex regression equations. Recognizing that perfect denoising is impractical in a lab setting, we used our prior knowledge of kinematics to infer the shape of the data and selectively excluded unnecessary functions from the options for creating the SR model. In our analysis of parabolic motion data, we leveraged the understanding that the initial velocity of the object can be decomposed into Cartesian axes, allowing the vertical component of the velocity to be expressed using the sine function. Additionally, we applied the knowledge that the peak of parabolic motion is related to the vertical velocity component. In the context of the damped oscillation, we employed our knowledge of the general solution of the differential equation for damped oscillation that the amplitude decreases exponentially while the frequency remains constant despite damping. This informed our approach to regress motion expressions as a blend of cosine functions with maximal initial values and exponential functions. Accordingly, we adjusted hyperparameters, selectively excluding functions other than our target set of sine, cosine, and exponential functions, to minimize errors.

Moreover, during the regression attempts, many cases arose in which functions similar in form to cosine, such as logarithms and arcsines, appeared. These functions were removed to reduce learning time and the complexity of the equations. The two algorithms restrict functions in different ways. In the AI-Feynman algorithm, a new operator code was entered into the text file specified in the ‘BF_ops_file_type’ variable of the ‘run_aifeynman’ module. In the Φ-SO algorithm, a list containing the names of the functions was entered into the ‘op_names’ variable of the SR module. This enabled us to obtain the expression relatively quickly. However, when students apply SR to an unfamiliar phenomenon, they must carefully observe the phenomenon, adjust terms in the equation, identify errors, and iteratively converge toward a well-fitting model. In this way, SR is expected to serve as a collaborative tool for students to model unfamiliar phenomena.

Physics classes integrating SR are expected to offer an innovative approach to overcoming the limitations of previously proposed physics education programs incorporating artificial intelligence. Previous studies have proposed using deep neural network (DNN) algorithms to analyze and predict the motion of objects[19-21]. Students in these studies are guided to create a model using experimental data, make predictions with test data, and verify them using validation data, while also learning about the models’ loss or error based on actual and predicted values. However, the mere predictive success of an ML model may not suffice for explanatory understanding unless the tool offers some explanatory information[22]. In contrast, mathematical models derived from SR facilitate the comprehension of the relationships between variables and system behavior, enhancing students’ explanation of physical phenomena.

The problem-solving approach of both algorithms, AI-Feynman and Φ-SO, is significant in terms of physics education. Both involve a dimensional analysis process, which is powerful when applied to a complete mathematical model in algebraic (differential and/or integral) form. It is remarkably productive even when a complete model is unknown or unwieldy and the analysis must be applied to a simple list of the relevant variables[23]. As dimensionless variables and scaling are powerful tools for emphasizing the universality of natural laws[24], they are taught in higher physics education. However, in the AI-Feynman and Φ-SO algorithms, dimensional analysis is implemented automatically and is not visible to students. Therefore, instructors can explicitly point out the dimensional analysis performed by both algorithms and teach the importance and role of dimensionless quantities in physics problem solving. It is expected that students will be able to identify the relationships between variables through dimensional analysis and learn that such a process contributes to simplifying complex systems.

Further discussion is needed to determine whether deriving mathematical equations from data using SR can be considered authentic scientific modeling. However, we identified the possibility of improving the efficiency of the modeling process with SR. Hestenes[1] proposed a scientific modeling strategy consisting of description, formulation, ramification, and validation, in the order of their implementation. In this study, we derived formulas for parabolic motion and damped oscillation using SR through a process of data collection, machine learning, formulation, and validation. Specifically, SR demonstrated time savings in the data-driven formulation. Realizing that it took Kepler four years to discover the elliptical orbit of Mars, whereas SR greatly reduces the time needed to derive Kepler's laws compared to traditional methods, gives students insight into the impact of technology on the scientific problem-solving process.

To successfully implement SR in physics classes, it is essential to first explore appropriate instructional strategies. If students merely derive mathematical equations from data and cease further inquiry, it may be perceived as a mere description of phenomena. The scientific method of Whewell[25] contains observing a phenomenon and then discovering an idea that connects the observations by introducing ‘some general conception,’ which is given, not by the phenomena, but by the mind. While Kepler introduced an idea that connects Brahe’s observation, it was Newton who delved into the underlying mechanisms behind these patterns and provided insightful explanations to demonstrate why planets move in such patterns[26]. In order to offer students an opportunity to engage with scientists' discoveries, physics lessons utilizing SR should go beyond deriving equations to attempting to explain the phenomena hidden in the data with physics theory.

In answer to the question of whether SR will diminish the role of scientists in the future, Schmidt & Lipson[10] argue that it may actually help scientists focus on interesting phenomena more rapidly and interpret their meaning. To engage students with captivating phenomena, it is imperative to offer ample opportunities for exploration and discovery within the physics classroom. While this study showcases the promise of SR as a tool to assist students in mathematical modeling in physics, further research is necessary to gauge its efficacy at the cognitive level of students. Therefore, further studies that integrate SR into actual physics classes and assess its influence on students' scientific problem-solving processes would be highly beneficial.

This work was supported by the Creative-Pioneering Researchers Program through Seoul National University.

  1. D. Hestenes, Toward a modeling theory of physics instruction, Am. J. Phys. 55, 440 (1987).
  2. W. Tenachi, R. Ibata and F. Diakogiannis, Deep Symbolic Regression for Physics Guided by Units Constraints: Toward the Automated Discovery of Physical Laws, Astrophys. J. 959, 99 (2023).
  3. N. Makke and S. Chawla, Interpretable scientific discovery with symbolic regression: a review, Artif. Intell. Rev. 57, 2 (2024).
  4. S. Udrescu and M. Tegmark, AI Feynman: A physics-inspired method for symbolic regression, Sci. Adv. 6, 2631 (2020).
  5. J. Koza, Genetic programming as a means for programming computers by natural selection, Stat. Comput. 4, 87 (1994).
  6. H. Yoon, Types and Characteristics of Physics Education Research Based on Conceptual Blending Theory, New Phys.: Sae Mulli 74, 182 (2024).
  7. B. Sherin, How Students Understand Physics Equations, Cogn. Instr. 19, 479 (2001).
  8. S. Brahmia, et al., Physics Inventory of Quantitative Literacy: A tool for assessing mathematical reasoning in introductory physics, Phys. Rev. Phys. Educ. Res. 17, 020129 (2021).
  9. K. Mashood, et al., Participatory approach to introduce computational modeling at the undergraduate level, extending existing curricula and practices: Augmenting derivations, Phys. Rev. Phys. Educ. Res. 18, 020136 (2022).
  10. M. Schmidt and H. Lipson, Distilling Free-Form Natural Laws from Experimental Data, Science 324, 81 (2009).
  11. B. Petersen, et al., Deep symbolic regression: Recovering mathematical expressions from data via risk-seeking policy gradients, International Conference on Learning Representations (2021).
  12. Z. Khoo, A. Yang, J. Low and S. Bressan, Database and Expert Systems Applications (DEXA 2023), Lecture Notes in Computer Science (Springer, Cham, 2023), Vol. 14147.
  13. M. Virgolin and S. Pissis, Symbolic regression is NP-hard, arXiv preprint arXiv:2207.01018.
  14. Z. Dong, K. Kong, K. Matchev and K. Matcheva, Is the machine smarter than the theorist: Deriving formulas for particle kinematics with symbolic regression, Phys. Rev. D 107, 055018 (2023).
  15. P. Lemos, et al., Rediscovering orbital mechanics with machine learning, Mach. Learn.: Sci. Technol. 4, 045002 (2023).
  16. A. Hernandez, et al., Fast, accurate, and transferable many-body interatomic potentials by symbolic regression, npj Comput. Mater. 5, 112 (2019).
  17. J. Clement, Creative Model Construction in Scientists and Students (Springer Netherlands, Dordrecht, 2008).
  18. H. Chang, Inventing temperature: Measurement and scientific progress, 1st ed, Korean translation (East-Asia publishing Co., 2013).
  19. U. Hong, J. Jang and S. Chae, Development of No-Code AI Convergence Education Teaching Material Using Graphical Workflow:KNIME-a Data Analysis Platform, Sch. Sci. J. 17, 34 (2023).
  20. J. Lee, J. Jo and S. Chae, Development of Data-driven Teaching Material for AI Convergence Education: Focused on Damped Oscillation, Sch. Sci. J. 15, 121 (2021).
  21. U. Hong, E. Shin, J. Jang and S. Chae, An Analysis of Students' Experiences Using the Block Coding Platform KNIME in a Science-AI Convergence Class at a Science Core High School, J. Korean Assoc. Sci. Educ. 44, 141 (2024).
  22. T. Räz and C. Beisbart, The Importance of Understanding Deep Learning, Erkenntnis 89, 1823 (2024).
  23. S. Churchill, A New Approach to Teaching Dimensional Analysis, Chem. Eng. Educ. 31, 158 (1997).
  24. J. Bissell, A. Ali and B. Postle, Illustrating dimensionless scaling with Hooke's law, Phys. Educ. 57, 023008 (2022).
  25. W. Whewell, The Philosophy of the Inductive Sciences, Founded upon their History, 2 (John W. Parker; reprinted London: Routledge/Thoemmes Press, 1996, London, 1840).
  26. Z. Li, J. Ji and Y. Zhang, From Kepler to Newton: Explainable AI for Science, arXiv:2111.12210.
