Application of Fuzzy Inference System in the Prediction of Air Quality Index

Air pollution is the presence of substances in the atmosphere that are harmful to the health of humans and other living beings. It is caused by solid and liquid particles and certain gases that are suspended in the air. The air pollution index (API), also known as the air quality index (AQI), is an indicator of the air quality status in an area. It is commonly used to report the severity of air pollution to the public and to identify zones with poor air quality. The AQI value is calculated from the average concentrations of air pollutants such as Particulate Matter (PM10), Ozone (O3), Carbon Monoxide (CO), Sulfur Dioxide (SO2) and Nitrogen Dioxide (NO2). Predicting the AQI value accurately is crucial for minimizing the impact of air pollution on the environment and human health. The work presented here proposes a model to predict the AQI value using a fuzzy inference system (FIS). FIS is the most well-known application of fuzzy logic and has been successfully applied in many fields. The method is proposed as a suitable technique for dealing with environmental problems and for decision making under uncertainty. There are five levels of AQI, namely good, moderate, unhealthy, very unhealthy, and hazardous. This classification follows that of the Department of Environment (DOE) under the Ministry of Science, Technology, and Innovation (MOSTI). The results from the proposed model are compared with the actual data. With an accuracy rate of 93%, the proposed model predicts the AQI value with a high level of accuracy.


INTRODUCTION
In the era of globalization and Industry 4.0, environmental pollution is one of the topics most often discussed in the media. Most countries in the world, including Malaysia, face the same problem, and many regions in Malaysia are experiencing severe air quality issues. In Malaysia, state, local and federal agencies are involved in identifying where air pollution occurs, what causes it, and how to control it. Air pollution has many sources, such as vehicle exhaust fumes, fossil fuel-based power plants, exhaust from industrial factories and plants, and agricultural and construction activities. It may cause diseases, allergies and even death in humans. It may also harm other living organisms such as animals and food crops, and may damage the natural environment. Monitoring the AQI accurately is essential for detecting pollution peaks and taking early preventive measures that can minimize the impact of air pollution on the environment and human health.
Previous studies have used various approaches to predict the air quality index (AQI). In this study, we focus on the fuzzy inference system (FIS) to predict the AQI value. According to Chaudhari and Patil (2014), the FIS approach produces expressive output whose results are easy to understand and to adjust towards a goal. FIS can also readily capture a changing environment in the form of expert knowledge. The very first application of this technique was to control a steam engine, and the result obtained was as good as when the engine was controlled by experienced human operators. According to Sowlat et al. (2011), FIS has been used effectively in many fields, including automatic control, data classification, decision analysis, expert systems, and computer vision. For this reason, fuzzy inference systems are referred to by various names, such as fuzzy expert systems, fuzzy rule-based systems, fuzzy modeling, fuzzy associative memory, fuzzy logic controllers and, ambiguously, simply fuzzy systems. FIS are broadly applicable in finance, science, economics, and engineering because of the intuitive nature of the system and its ability to emulate human decision making. Samira and Ahmad (2016) noted that FIS is the most well-known application of fuzzy logic, in which the membership functions generally have to be adjusted manually through trial and error. According to Cavallaro (2015), fuzzy logic reasoning consists of two kinds of information. The first concerns the labels and membership functions allocated to the input and output variables. The correct choice of these is one of the most critical stages in the design of the model. The other kind of information is associated with the rule base, which converts the fuzzy values of the inputs into fuzzy values of the outputs. Kumaravel (2012) concluded that the important elements of FIS are fuzzification, fuzzy decision, the fuzzy rule base and defuzzification.
According to Tiwari (2015), FIS assumes that each rule is activated at each cycle and that all rules contribute collectively to the solution. It is a parallel, single-pass inference. However, the inference process can be continued, since newly inferred results can be fed back in as inputs. Methods based on fuzzy set theory should be applied within the context of environmental measurements. The limit between an acceptable and an unacceptable concentration should not be considered sharp, but fuzzy, with implications for subsequent action plans. Therefore, the use of fuzzy numbers is proposed as a suitable technique for dealing with environmental problems and for decision making under uncertainty.

Method of Data Collection
The hourly AQI data used in this study are secondary data obtained from the Department of Environment under the Ministry of Science, Technology, and Innovation (MOSTI) for the state of Kelantan. There are 122 data sets, collected from 1st June to 31st July 2016. They were recorded at Continuous Air Quality Monitoring (CAQM) stations located in both industrial and urban areas in Kelantan.

Rating scales for Air Quality Index (AQI)
The rating scale of the AQI status indicator consists of Good, Moderate, Unhealthy, Very Unhealthy, and Hazardous, as shown in Table 1. This classification follows that of the Department of Environment (DOE) under the Ministry of Science, Technology, and Innovation (MOSTI). There are five major pollutants that contribute to air pollution, namely Sulfur Dioxide (SO2), Nitrogen Dioxide (NO2), Carbon Monoxide (CO), Ozone (O3) and Particulate Matter (PM10). The most significant input variables, and the ones used in this study, are PM10, CO and NO2.

Data Analysis
Data analysis is done using MATLAB, a software package with numerous built-in tools for solving problems and for graphical illustration. The method used to predict the AQI with FIS consists of four steps: 1) fuzzification of the input and output variables, 2) fuzzy decision (selection of membership functions for the input and output variables), 3) fuzzy rule base (determination of the rule base application) and 4) defuzzification. The structure of a fuzzy system is shown in Figure 1.

I) Fuzzification of Input and Output Variables
Fuzzification is the process of transforming a crisp value into a fuzzy value. The operation translates crisp inputs or measured values into linguistic concepts using appropriate membership functions. Fuzzy logic captures human experience and preferences through membership functions and fuzzy rules. In fuzzy logic, each linguistic variable is associated with confidence values, such that each term has its own confidence value. Appropriate fuzzy linguistic values are allocated to each fuzzy variable, and the inputs and output are converted into linguistic form. As a result, the input value is used to generate the output in the rule viewer, as given in Equation (1).
(1)
Figure 2 shows the fuzzy inference system, which consists of three inputs (PM10, CO and NO2) and one output, the air quality index (AQI). All the membership functions are developed using the FIS editor and the membership function editor in MATLAB.

II) Fuzzy Decision - Selection of Membership Functions for Input and Output Variables
Linguistic values are expressed in the form of fuzzy sets. A fuzzy set is defined by its membership function. Generally, triangular and trapezoidal membership functions are used to normalize the crisp values because of their simplicity and computational efficiency. They are defined mathematically in the following equations. The triangular function in Equation (2) is defined by a lower limit a, an upper limit b, and a value m, where a < m < b. The triangular membership function is used to transform the linguistic values into the range 0 to 1.
\[
\mu(x) =
\begin{cases}
0, & x \le a \\
\dfrac{x - a}{m - a}, & a < x \le m \\
\dfrac{b - x}{b - m}, & m < x < b \\
0, & x \ge b
\end{cases}
\tag{2}
\]

Figure 3: Triangular Membership Function
The trapezoidal function in Equation (3) is defined by a lower limit a, an upper limit d, a lower support limit b, and an upper support limit c, where a < b < c < d.
\[
\mu(x) =
\begin{cases}
0, & x < a \ \text{or} \ x > d \\
\dfrac{x - a}{b - a}, & a \le x \le b \\
1, & b \le x \le c \\
\dfrac{d - x}{d - c}, & c \le x \le d
\end{cases}
\tag{3}
\]
Table 2 shows the fuzzy sets of the input indicators and of the output indicators. Table 3 shows the range of each input and Table 4 shows the parameters used in the membership function editor for each input. Figure 8 shows the membership function editor for the output AQI. The membership function input categories were obtained by setting up the range of each input variable as shown in Table 4, followed by setting up the parameters for each input as shown in Table 5. The membership functions of inputs 1, 2 and 3 and of the output are shown in detail in Table 5. The functions are formulated based on Equations (2) and (3). The output ranges from 3 to 14. Table 6 shows the range of the output and Table 7 shows the parameters used in the membership function editor for the output data.
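The triangular and trapezoidal membership functions described above can be sketched in a few lines of Python. This is only an illustration: the study itself uses the MATLAB membership function editor, and the breakpoints and the helper name `fuzzify` below are hypothetical, not the parameters listed in the tables.

```python
def triangular(x, a, m, b):
    """Triangular membership: 0 outside (a, b), rising to 1 at the peak m."""
    if x <= a or x >= b:
        return 0.0
    if x <= m:
        return (x - a) / (m - a)
    return (b - x) / (b - m)


def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership: 0 outside (a, d), equal to 1 on [b, c]."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


# Hypothetical fuzzy sets for one input (assumed breakpoints, for illustration only).
pm10_sets = {
    "excellent": lambda x: trapezoidal(x, -1.0, 0.0, 20.0, 40.0),
    "moderate": lambda x: triangular(x, 20.0, 40.0, 60.0),
}


def fuzzify(x, fuzzy_sets):
    """Map a crisp value to its degree of membership in each linguistic label."""
    return {label: mu(x) for label, mu in fuzzy_sets.items()}


# PM10 = 26.404 is the sample input value quoted in the findings.
degrees = fuzzify(26.404, pm10_sets)
# With these assumed breakpoints: about 0.68 "excellent" and 0.32 "moderate".
```

A crisp reading can belong to two neighbouring labels at once, which is what allows several rules to fire for the same input.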

III) Determination of Rule Base Application
The rules defined over the membership functions of the inputs and output are used in the fuzzy inference process. The rules are linguistic and are also known as "IF-THEN" rules. The proposed fuzzy model is based on the Mamdani fuzzy model structure given in Equation (4).
\[
R^i:\ \text{IF } X_1 \text{ is } L_1^i \wedge X_2 \text{ is } L_2^i \wedge \dots \wedge X_{M-1} \text{ is } L_{M-1}^i \wedge X_M \text{ is } L_M^i \ \text{THEN } \mu_{R^i} \to Y
\tag{4}
\]
where $i = 1, \dots, n$; $L_j^i \in \{L_{j,k}\}$; $j = 1, \dots, m$; $k = 1, \dots, n$; and $n$ is the number of linguistic labels defined for the $j$ input variables.
Table 8 shows the arrangement of fuzzy numbers for each input variable. It is used to produce the IF-THEN rules. The calculation below shows how to find the minimum and maximum values for the input variables, which are then used to find the arrangement of fuzzy numbers for the output variable.

IF (PM10 is severe) and (CO is poor) and (NO2 is poor) THEN (AQI is hazardous)
IF (PM10 is severe) and (CO is poor) and (NO2 is moderate) THEN (AQI is hazardous)
IF (PM10 is severe) and (CO is poor) and (NO2 is excellent) THEN (AQI is very unhealthy)
Figure 9 shows the rule editor as set up in MATLAB. There are 96 Mamdani IF-THEN rules, which are used to determine whether the AQI is good, moderate, unhealthy, very unhealthy, or hazardous.
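To make the rule evaluation concrete, the sketch below fires the three rules quoted above. It assumes the common Mamdani choices of min for "and" and max for combining rules with the same consequent; the text does not state which operators were selected in MATLAB. The membership degrees and the function name `fire_rules` are hypothetical.

```python
# The three rules quoted in the text, stored as (PM10, CO, NO2) -> AQI label.
RULES = {
    ("severe", "poor", "poor"): "hazardous",
    ("severe", "poor", "moderate"): "hazardous",
    ("severe", "poor", "excellent"): "very unhealthy",
}


def fire_rules(pm10, co, no2, rules=RULES):
    """Return the strongest firing strength for each AQI label.

    Assumption: AND is taken as min, and rules that share a consequent
    are combined with max.
    """
    strengths = {}
    for (l_pm10, l_co, l_no2), aqi in rules.items():
        # A label missing from an input's dictionary has membership 0.
        w = min(pm10.get(l_pm10, 0.0), co.get(l_co, 0.0), no2.get(l_no2, 0.0))
        strengths[aqi] = max(strengths.get(aqi, 0.0), w)
    return strengths


# Hypothetical membership degrees for one observation (not taken from the data).
pm10 = {"severe": 0.8}
co = {"poor": 0.6}
no2 = {"poor": 0.1, "moderate": 0.7, "excellent": 0.2}

strengths = fire_rules(pm10, co, no2)
# strengths == {"hazardous": 0.6, "very unhealthy": 0.2}
```

The full model would hold all 96 rules in the same kind of table; only the three listed in the text are reproduced here.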

IV) Defuzzification Process
Defuzzification transforms the fuzzy value into a crisp value. In this study, the centroid method is applied, which is one of the most common methods for the defuzzification process. With this approach, the fuzzy number is determined geometrically.
After the rules are set, the data are entered into the rule viewer and the result is generated, as shown in the right-hand column of Figure 10.
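A minimal numerical sketch of centroid defuzzification is given below. It samples the output range 3 to 14 stated earlier and takes the centre of area of the aggregated output set. The output sets, their breakpoints, the firing strengths and the sampling step are all assumptions made for illustration, and MATLAB's own implementation may differ in detail.

```python
def tri(x, a, m, b):
    """Triangular membership with feet at a and b and peak at m."""
    if x <= a or x >= b:
        return 0.0
    if x <= m:
        return (x - a) / (m - a)
    return (b - x) / (b - m)


def centroid(xs, mus):
    """Discrete centroid: sum(x * mu) / sum(mu) over the sampled points."""
    total = sum(mus)
    if total == 0.0:
        raise ValueError("aggregated membership is zero everywhere")
    return sum(x * mu for x, mu in zip(xs, mus)) / total


# Output universe from 3 to 14 (the range given in the text), sampled every 0.01.
xs = [3.0 + 0.01 * i for i in range(1101)]

# Two hypothetical output sets, each clipped (min) at a hypothetical firing
# strength and then aggregated with max, as is common for Mamdani systems.
mus = [
    max(min(0.6, tri(x, 3.0, 5.0, 7.0)), min(0.2, tri(x, 5.0, 7.0, 9.0)))
    for x in xs
]

crisp = centroid(xs, mus)  # a single crisp value inside the 3 to 14 range
```

For a single symmetric triangular set the centroid coincides with the peak, which is a quick sanity check on the implementation.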

FINDINGS AND DISCUSSIONS
The AQI results obtained using the Mamdani fuzzy inference system are shown in Table 9. The values in Table 9 are a subset of the 122 outputs. The linguistic status of all the values shown is "Good". The results of the actual data output are shown in Table 10. The value is obtained based on the 96 IF-THEN rules described in the data analysis section. For example, IF (PM10 is Excellent) and (CO is Excellent) and (NO2 is Excellent) THEN (AQI is Good), with the input values PM10 = 26.404, CO = 0.4325 and NO2 = 0.008.
Table 11 shows the comparison statements between the actual data output and the Mamdani fuzzy inference status for the AQI reading. The statement is "TRUE" when the linguistic value for the actual data and the Mamdani status from MATLAB are the same, for example when both are "Good". If the two linguistic values differ, the statement is "FALSE". The comparison statements in Table 11 are a subset of the 122 statements. The overall results of the comparison are represented in Figure 11. Of the 122 data sets collected, there are 113 TRUE statements and 9 FALSE statements. That is, there are 113 correct responses out of 122.
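The comparison described above amounts to an element-wise equality check between two lists of linguistic labels. The sketch below shows the idea on four made-up records; the labels and the function name `compare_statuses` are hypothetical and are not taken from Tables 9 to 11.

```python
def compare_statuses(actual, predicted):
    """Return True where the actual and predicted AQI labels agree, else False."""
    if len(actual) != len(predicted):
        raise ValueError("both lists must have the same length")
    return [
        a.strip().lower() == p.strip().lower()
        for a, p in zip(actual, predicted)
    ]


# Hypothetical labels for four records (illustration only).
actual = ["Good", "Good", "Moderate", "Good"]
predicted = ["Good", "Good", "Good", "Good"]

statements = compare_statuses(actual, predicted)
n_true = sum(statements)               # number of TRUE statements
n_false = len(statements) - n_true     # number of FALSE statements
```

Applied to the full data, the same counting would yield the 113 TRUE and 9 FALSE statements reported in the text.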

Figure 11: Number of TRUE and FALSE statements
The accuracy rate is calculated as follows.
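Taking the accuracy rate as the share of TRUE statements among all comparison statements, and using the counts reported above (113 TRUE out of 122), the calculation works out as:

\[
\text{Accuracy rate} = \frac{\text{number of TRUE statements}}{\text{total number of statements}} \times 100\% = \frac{113}{122} \times 100\% \approx 92.6\% \approx 93\%
\]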

CONCLUSION AND RECOMMENDATIONS
The main objective of this study is to predict the AQI value accurately using a proposed fuzzy inference system model. A fuzzy inference system (FIS) model is developed to evaluate the air quality index (AQI) by considering the values of three major inputs, namely PM10, CO and NO2. The evaluation system is based on the Mamdani fuzzy inference system and proceeds through four major steps: fuzzification, fuzzy decision, fuzzy rule base and defuzzification. The high accuracy rate (93%) shows that the proposed FIS model using the Mamdani approach works well and efficiently in predicting the AQI. Thus, the model is well suited to assessing the AQI accurately.
This study has applied a Mamdani rule-based system to analyse AQI data. In future work, the data could be analysed using other fuzzy inference approaches, such as the Sugeno fuzzy inference system.