@hygull

Rishikesh Agrawani

rishikesh0014051992@gmail.com


1. DataFrame, numpy, pandas, file handling

Getting a Series from a DataFrame (column 'a') that holds the values greater than 100, then converting it to a list

The original question is on Stack Overflow at https://stackoverflow.com/questions/51549323/creating-a-list-from-data-frame...
#Python2.7, #pandas, #slicing, #list, #DataFrame, #Series
`df[df['a'] > 100].loc[:, 'a']` is enough to get the Series, and `df[df['a'] > 100].loc[:, 'a'].tolist()` is enough to get the list.
Selecting the values of column `a` that are greater than 100:
    >>> df[df['a'] > 100].loc[:, 'a']
    4      101
    302    101
    Name: a, dtype: int64
    >>>
    >>> type(df[df['a'] > 100].loc[:, 'a'])
    <class 'pandas.core.series.Series'>
Converting the above Series into a list:
    >>> l = df[df['a'] > 100].loc[:, 'a'].tolist()
    >>> l
    [101, 101]
    >>>
    >>> type(l)
    <class 'list'>
    >>>
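The same selection can also be written as a single `.loc` call, with the boolean mask as the row selector and the column label as the column selector. The following is a minimal sketch, assuming the same sample data that the detailed walkthrough in this note builds; the names `mask` and `values` are illustrative only.

```python
import pandas as pd

# Sample data matching the DataFrame used in this note.
df = pd.DataFrame(
    [[100, 57, 23], [99, 56, 23], [100, 56, 20], [101, 57, 23],
     [99, 50, 23], [99, 51, 29], [101, 57, 22]],
    index=['1', '2', '3', '4', '300', '301', '302'],
    columns=['a', 'b', 'c'],
)

mask = df['a'] > 100                  # boolean Series, one entry per row
values = df.loc[mask, 'a'].tolist()   # filter rows and pick column 'a' in one step
print(values)  # [101, 101]
```

This avoids building the intermediate filtered DataFrame, although the chained form shown above gives the same result.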
Let's look at the above code in more detail.
    >>> import numpy as np
    >>> import pandas as pd
    >>>
    >>> arr = [[100, 57, 23], [99, 56, 23],
    ... [100, 56, 20], [101, 57, 23], [99, 50, 23],
    ... [99, 51, 29], [101, 57, 22]]
    >>>
    >>> columns = [ch for ch in 'abc']
    >>> indices = [str(n) for n in [1, 2, 3, 4, 300, 301, 302]]
    >>>
    >>> df = pd.DataFrame(arr, index=indices, columns=columns)
    >>> df
           a   b   c
    1    100  57  23
    2     99  56  23
    3    100  56  20
    4    101  57  23
    300   99  50  23
    301   99  51  29
    302  101  57  22
    >>>
    >>> df['a'] > 100
    1      False
    2      False
    3      False
    4       True
    300    False
    301    False
    302     True
    Name: a, dtype: bool
    >>>
    >>> arr2 = df.loc[:, 'a']  # every row of column 'a', as a Series
    >>> arr2
    1      100
    2       99
    3      100
    4      101
    300     99
    301     99
    302    101
    Name: a, dtype: int64
    >>>
    >>> arr2 = df[df['a'] > 100]  # keep only the rows where the mask is True
    >>> arr2
           a   b   c
    4    101  57  23
    302  101  57  22
    >>>
    >>> arr3 = df[df['a'] > 100].loc[:, 'a']
    >>> arr3
    4      101
    302    101
    Name: a, dtype: int64
    >>>
    >>> l = arr3.tolist()
    >>> l
    [101, 101]
    >>>
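If the same filter is needed for other columns or thresholds, the steps above could be wrapped in a small helper. This is only a sketch: `values_above` is a hypothetical name introduced here, not something from the original question or from pandas itself.

```python
import pandas as pd


def values_above(frame, column, threshold):
    """Return the values of `column` greater than `threshold` as a plain list."""
    # Boolean mask selects the rows; the column label selects the Series.
    return frame.loc[frame[column] > threshold, column].tolist()


# Same sample data as in the walkthrough above.
df = pd.DataFrame(
    [[100, 57, 23], [99, 56, 23], [100, 56, 20], [101, 57, 23],
     [99, 50, 23], [99, 51, 29], [101, 57, 22]],
    index=['1', '2', '3', '4', '300', '301', '302'],
    columns=['a', 'b', 'c'],
)

print(values_above(df, 'a', 100))  # [101, 101]
print(values_above(df, 'b', 56))   # [57, 57, 57]
```

Note that the list keeps only the values; the index labels ('4' and '302' for column `a`) are dropped by `tolist()`.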