Introduction to WEKA tool

    WEKA (Waikato Environment for Knowledge Analysis) is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from Java code. WEKA contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well suited for developing new machine learning schemes. The name comes from the weka, a flightless bird with an inquisitive nature. WEKA is open source software issued under the GNU General Public License, and it can also be used for Big Data analysis. Because WEKA is freely available for download and offers many powerful features (some of which are not found in commercial data mining software), it has become one of the most widely used data mining systems. It has also become one of the favorite vehicles for data mining research and has helped to advance the field by making these features available to everyone. The main tasks in data and web mining are:
• Data preprocessing and visualization
• Attribute selection
• Classification (OneR, Decision trees)
• Prediction (Nearest neighbor)
• Model evaluation
• Clustering (K-means, Cobweb)
• Association rules

WEKA provides support for all of these tasks.

1. Data preprocessing and visualization:
WEKA provides facilities for loading data from databases and other sources, either by importing files or by connecting directly to a database server using the appropriate driver. It also provides converters for various file formats such as CSV, C4.5, and ARFF. WEKA supports the following file formats for direct import of data:
1. CSV
2. ARFF (Attribute-Relation File Format)
3. C4.5
4. LibSVM
5. .dat
6. .data
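
To give a feel for what ARFF looks like compared with CSV, here is a minimal sketch in plain Java that turns a small CSV text into ARFF text. It is not WEKA's own converter: the class `CsvToArffSketch` and the method `toArff` are names I made up for illustration, and the code assumes a header row, comma separators, attribute names without spaces, and no quoted or missing values. A column is declared `numeric` if every value parses as a number; otherwise it becomes a nominal attribute listing the values seen.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not part of WEKA): converts simple CSV text to ARFF text.
// Assumes a header row, comma separators, and no quoted or missing values.
class CsvToArffSketch {

    static String toArff(String relation, String csv) {
        String[] lines = csv.trim().split("\\R");
        String[] names = lines[0].trim().split("\\s*,\\s*");

        // Parse every data row into trimmed cells.
        List<String[]> rows = new ArrayList<>();
        for (int i = 1; i < lines.length; i++) {
            rows.add(lines[i].trim().split("\\s*,\\s*"));
        }

        StringBuilder out = new StringBuilder("@relation " + relation + "\n\n");

        // One @attribute line per column: numeric, or nominal with its labels.
        for (int col = 0; col < names.length; col++) {
            boolean numeric = true;
            Set<String> labels = new LinkedHashSet<>();
            for (String[] row : rows) {
                labels.add(row[col]);
                if (!isNumber(row[col])) {
                    numeric = false;
                }
            }
            String type = numeric ? "numeric" : "{" + String.join(",", labels) + "}";
            out.append("@attribute ").append(names[col]).append(' ').append(type).append('\n');
        }

        // The @data section holds the rows themselves, still comma separated.
        out.append("\n@data\n");
        for (String[] row : rows) {
            out.append(String.join(",", row)).append('\n');
        }
        return out.toString();
    }

    static boolean isNumber(String s) {
        try {
            Double.parseDouble(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Made-up sample data, only for illustration.
        String csv = "age,gender\n21,female\n22,male\n20,female\n";
        System.out.print(toArff("registrations", csv));
    }
}
```

The main difference from CSV is the header: ARFF names the relation and declares the type of every attribute before the `@data` section, so a tool reading the file does not have to guess the types.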

    WEKA also supports viewing the data after it has been loaded. Once the data is loaded, we can apply various filters to it to obtain the graphs (the required information). These filters are of two types:
1. Supervised
2. Unsupervised
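
An unsupervised filter transforms the attributes without looking at the class label. One example is WEKA's unsupervised Normalize filter, which rescales numeric attributes to the range [0, 1]. The sketch below shows the idea of min-max scaling on a single numeric attribute in plain Java. It is not WEKA's implementation; the class `MinMaxNormalizeSketch` and its `normalize` method are hypothetical names used only for illustration, and mapping a constant column to 0.0 is a choice I made for this sketch.

```java
import java.util.Arrays;

// Hypothetical sketch (not WEKA code): min-max scaling of one numeric attribute.
class MinMaxNormalizeSketch {

    // Maps each value to (value - min) / (max - min), so results lie in [0, 1].
    // If all values are equal, every value is mapped to 0.0 (a choice made here).
    static double[] normalize(double[] values) {
        double min = Arrays.stream(values).min().orElse(0.0);
        double max = Arrays.stream(values).max().orElse(0.0);
        double range = max - min;

        double[] scaled = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            scaled[i] = (range == 0.0) ? 0.0 : (values[i] - min) / range;
        }
        return scaled;
    }

    public static void main(String[] args) {
        // Made-up ages, only for illustration.
        double[] ages = {20, 22, 24};
        System.out.println(Arrays.toString(normalize(ages)));
    }
}
```

Note that the class label plays no part in the calculation, which is what makes this kind of filter unsupervised.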

2. Attribute selection:

     After loading the data, if we want to see the data for a specific attribute, we can select that attribute from the list shown in the bottom-left corner of the WEKA Explorer window and choose a suitable filter. Example: I wanted to analyze the registration data for my college’s national-level event “PRAGYAA” for the year 2014. I have this data in CSV format. After loading it into WEKA, I wanted to see how many male and female candidates registered in 2014. To do this, I simply select the gender attribute, and WEKA shows the required result, as shown in the following figure:
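
The count that WEKA shows for a nominal attribute such as gender is simply the number of rows for each distinct value. As a rough sketch of that idea in plain Java (not WEKA code), the hypothetical helper `countValues` below tallies the values of one column in a CSV text. The sample rows are made up for illustration and are not the actual PRAGYAA registration data; the code assumes a header row, comma separators, and no quoted or missing values.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch (not WEKA code): counts how often each value of one
// CSV column occurs, similar to the per-label counts shown for a nominal attribute.
class NominalCountsSketch {

    static Map<String, Integer> countValues(String csv, String attribute) {
        String[] lines = csv.trim().split("\\R");
        String[] header = lines[0].trim().split("\\s*,\\s*");

        // Find the column index of the requested attribute.
        int col = Arrays.asList(header).indexOf(attribute);
        if (col < 0) {
            throw new IllegalArgumentException("No such attribute: " + attribute);
        }

        // TreeMap keeps the labels in alphabetical order.
        Map<String, Integer> counts = new TreeMap<>();
        for (int i = 1; i < lines.length; i++) {
            String value = lines[i].trim().split("\\s*,\\s*")[col];
            counts.merge(value, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up rows, only for illustration.
        String csv = "id,gender,year\n1,female,2014\n2,male,2014\n3,female,2014\n";
        System.out.println(countValues(csv, "gender"));
    }
}
```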


