Thursday, August 26, 2021

Python - Common Functions for Exploratory Data Analysis

In this tutorial, we will look at common functions for Exploratory Data Analysis (EDA) that we use in our data science work with Python.

Python is one of the fastest-growing programming languages. Whether it's data manipulation with Pandas, creating visualizations with Seaborn, or deep learning with TensorFlow, Python seems to have a tool for everything.

Pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with structured (tabular, multidimensional, potentially heterogeneous) and time series data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python.

NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.
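As a small illustration of the array support described above, here is a minimal sketch. The array values and variable names are made up for the example.

```python
import numpy as np

# A small 2-D array (matrix) with 2 rows and 3 columns
a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

shape = a.shape              # number of rows and columns
col_means = a.mean(axis=0)   # mean of each column
total = a.sum()              # sum of all elements
scaled = a * 10              # element-wise arithmetic, no explicit loop

print(shape)
print(col_means)
print(total)
print(scaled)
```

The mathematical functions operate on whole arrays at once, which is why no Python loop is needed here.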

In data science, the goal in most cases is not just to explore the data but to analyze it in some way, often through a model.

A Pandas DataFrame is a 2-dimensional labeled data structure with columns of potentially different types. It is similar to an Excel spreadsheet, a SQL table, or a dict of Series objects. It is generally the most commonly used pandas object. Like Series, DataFrame accepts many different kinds of input:

• Dict of 1D ndarrays, lists, dicts, or Series

• 2-D numpy.ndarray

• Structured or record ndarray

• A Series

• Another DataFrame
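The input kinds listed above can be sketched as follows. The column names and values are illustrative assumptions, not part of the original post.

```python
import numpy as np
import pandas as pd

# From a dict of lists: the keys become column names
df_from_dict = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})

# From a 2-D NumPy array: column names are supplied explicitly
df_from_array = pd.DataFrame(np.arange(6).reshape(3, 2), columns=["x", "y"])

# From a Series: the Series name becomes the column name
df_from_series = pd.DataFrame(pd.Series([10, 20, 30], name="value"))

# From another DataFrame: copy=True gives an independent copy
df_from_df = pd.DataFrame(df_from_dict, copy=True)

print(df_from_dict)
print(df_from_array)
print(df_from_series)
print(df_from_df)
```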

Along with the data, we can optionally pass index (row labels) and columns (column labels) arguments. By passing an index and/or columns, we guarantee the index and/or columns of the resulting DataFrame. Thus, a dict of Series plus a specific index will discard all data not matching up to the passed index.
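A minimal sketch of this behaviour, using a made-up dict of Series (the labels "a" to "e" and the column names are assumptions for the example):

```python
import pandas as pd

d = {
    "one": pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"]),
    "two": pd.Series([10.0, 20.0, 30.0, 40.0], index=["a", "b", "c", "d"]),
}

# Without an index, the result uses the union of the Series indexes
df_all = pd.DataFrame(d)

# With an explicit index, only matching labels are kept;
# "e" appears in neither Series, so its row is all NaN
df_sub = pd.DataFrame(d, index=["d", "b", "e"])

# columns selects and orders the columns of the result
df_cols = pd.DataFrame(d, index=["d", "b"], columns=["two", "one"])

print(df_all)
print(df_sub)
print(df_cols)
```

Rows "a" and "c" are present in the input but absent from `df_sub`, because they do not match the passed index.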

1. pandas.DataFrame.describe() is a very informative function that generates descriptive statistics for the data in a Pandas DataFrame or Series. It summarizes the central tendency and dispersion of the dataset, which makes describe() a quick way to get an overview of the data.

2. Head and tail functions - If you want to view a small sample of a Series or DataFrame object, use the head() and tail() methods. The default number of elements to display is five, but you may pass a custom number.

import pandas as pd
import numpy as np

# Create a Series of 400 random numbers
s = pd.Series(np.random.randn(400))

# The first two rows of the Series
print(s.head(2))

# The last two rows of the Series
print(s.tail(2))

# Create a dictionary of Series
d = {'Name': pd.Series(['Tom', 'James', 'Ricky', 'Vin', 'Steve', 'Smith', 'Jack']),
     'Age': pd.Series([25, 26, 25, 23, 30, 29, 23]),
     'Rating': pd.Series([4.23, 3.24, 3.98, 2.56, 3.20, 4.6, 3.8])}

# Create a DataFrame
df = pd.DataFrame(d)

# The first two rows of the DataFrame
print(df.head(2))

# The last two rows of the DataFrame
print(df.tail(2))
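The describe() function from point 1 can be sketched on the same kind of data. This is a minimal example that rebuilds the small Name/Age/Rating table so it runs on its own; the variable names summary and summary_all are chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Tom', 'James', 'Ricky', 'Vin', 'Steve', 'Smith', 'Jack'],
    'Age': [25, 26, 25, 23, 30, 29, 23],
    'Rating': [4.23, 3.24, 3.98, 2.56, 3.20, 4.6, 3.8],
})

# By default, describe() summarizes only the numeric columns
# (count, mean, std, min, quartiles, max)
summary = df.describe()
print(summary)

# include='all' also summarizes non-numeric columns
# (count, unique, top, freq for the Name column)
summary_all = df.describe(include='all')
print(summary_all)
```

The default call leaves out the Name column because it is not numeric; include='all' is one way to bring it into the summary.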


As we know, Exploratory Data Analysis (EDA) is one of the most essential parts of the data science process.

To learn more, please follow us -
http://www.sql-datatools.com
To learn more, please visit our YouTube channel at -
http://www.youtube.com/c/Sql-datatools
To learn more, please visit our Instagram account at -
https://www.instagram.com/asp.mukesh/
To learn more, please visit our Twitter account at -
https://twitter.com/macxima
