
Several ways to create a DataFrame

import pandas as pd

Creating from a dictionary

dic1 = {"name": ["小明", "小红", "小孙"],
        "age": [20, 18, 27],
        "sex": ["男", "女", "男"]
        }
pd.DataFrame(dic1)
   name  age sex
0  小明   20  男
1  小红   18  女
2  小孙   27  男

Creating from JSON

json_arr = [
  {
  "name": "京基智农",
  "no": "000048",
  "url": "http://stock.jrj.com.cn/share,000048.shtml",
  "price": 17.7,
  "up_or_down": "-0.39%",
  "num_ratio": 0.76,
  "change_ratio": "0.08%",
  "pe": 8.6
  },
  {
  "name": "广弘控股",
  "no": "000529",
  "url": "http://stock.jrj.com.cn/share,000529.shtml",
  "price": 6.34,
  "up_or_down": "0.32%",
  "num_ratio": 1.64,
  "change_ratio": "0.45%",
  "pe": 12.24
  },
  {
  "name": "龙大美食",
  "no": "002726",
  "url": "http://stock.jrj.com.cn/share,002726.shtml",
  "price": 10.95,
  "up_or_down": "2.05%",
  "num_ratio": 1.37,
  "change_ratio": "0.8%",
  "pe": 19.07
  }
]

pd.DataFrame(json_arr)
name no url price up_or_down num_ratio change_ratio pe
0 京基智农 000048 http://stock.jrj.com.cn/share,000048.shtml 17.70 -0.39% 0.76 0.08% 8.60
1 广弘控股 000529 http://stock.jrj.com.cn/share,000529.shtml 6.34 0.32% 1.64 0.45% 12.24
2 龙大美食 002726 http://stock.jrj.com.cn/share,002726.shtml 10.95 2.05% 1.37 0.8% 19.07

This works much like creating a DataFrame from Series, covered later in this document.
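The `json_arr` above is already a Python list of dicts. When the data arrives as raw JSON text instead, one option is to parse it with the standard `json` module first. A minimal sketch, assuming a hypothetical `json_text` string that holds two of the records above, trimmed to three fields:

```python
import json

import pandas as pd

# Hypothetical raw JSON text (two records, three fields each, for brevity).
json_text = '''
[
  {"name": "京基智农", "no": "000048", "price": 17.7},
  {"name": "广弘控股", "no": "000529", "price": 6.34}
]
'''

records = json.loads(json_text)  # list of dicts, same shape as json_arr
df_json = pd.DataFrame(records)  # each dict becomes one row

# "no" is a JSON string, so it stays a string column and keeps its leading zeros.
print(df_json)
print(df_json.dtypes)
```

Keeping the stock code as a string is deliberate here: stored as a number, `000048` would lose its leading zeros.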

Creating from a nested dictionary

dic2 = {'数量': {'苹果': 3, '梨': 2, '草莓': 5},
        '价格': {'苹果': 10, '梨': 9, '草莓': 8},
        '产地': {'苹果': '陕西', '梨': '山东', '草莓': '广东'}
        }
pd.DataFrame(dic2)
    数量  价格  产地
苹果   3  10  陕西
梨     2   9  山东
草莓   5   8  广东
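In the result above, the outer keys of `dic2` became the columns and the inner keys became the row index. If the opposite layout is wanted, `pd.DataFrame.from_dict` with `orient="index"` is one way to get it. A small sketch, reusing a shortened copy of `dic2`:

```python
import pandas as pd

# Shortened copy of dic2 (the 产地 column is left out for brevity).
dic2 = {"数量": {"苹果": 3, "梨": 2, "草莓": 5},
        "价格": {"苹果": 10, "梨": 9, "草莓": 8}}

by_column = pd.DataFrame(dic2)                          # outer keys -> columns
by_row = pd.DataFrame.from_dict(dic2, orient="index")   # outer keys -> rows

print(by_column)
print(by_row)
```

Transposing with `by_column.T` should give the same layout as `by_row`; which one reads better is mostly a matter of taste.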

Creating from a list

lst = ['小明','小红', '小黄']
df1 = pd.DataFrame(lst, columns=["姓名"])
print(df1)
# Change the index
# df2 = pd.DataFrame(lst, columns=["姓名"], index=[1,2,3])
# print(df2)
   姓名
0  小明
1  小红
2  小黄
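The commented-out lines in the snippet show how to replace the default 0-based index. Uncommented, they would look roughly like this:

```python
import pandas as pd

lst = ["小明", "小红", "小黄"]

# index= supplies the row labels; it needs one label per row.
df2 = pd.DataFrame(lst, columns=["姓名"], index=[1, 2, 3])
print(df2)
```

The rows are then addressed by the new labels, for example `df2.loc[1]` for the first row.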

Creating from a nested list

lst = [["小明", "20", "男"],
       ["小红", "23", "女"],
       ["小周", "19", "男"],
       ["小孙", "28", "男"]
       ]
pd.DataFrame(lst, columns=["姓名", "年龄", "性别"])
   姓名 年龄 性别
0  小明  20   男
1  小红  23   女
2  小周  19   男
3  小孙  28   男

Creating from a tuple

tup = ("小明", "小红", "小周", "小孙")
df12 = pd.DataFrame(tup, columns=["姓名"])
print(df12)
   姓名
0  小明
1  小红
2  小周
3  小孙

tup2 = [("小孙", "男", "12", "1991-03-13"),
        ("小明", "男", "12", "1991-03-13"),
        ("小红", "男", "12", "1991-03-13")]
pd.DataFrame(tup2, columns=["姓名", "性别", "年龄", "出生日期"])
   姓名 性别 年龄    出生日期
0  小孙   男  12  1991-03-13
1  小明   男  12  1991-03-13
2  小红   男  12  1991-03-13

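This approach is similar to building a DataFrame from rows fetched from MySQL; the difference is that the MySQL result comes back as tuples.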
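To make the database case concrete, here is a rough sketch of turning fetched rows into a DataFrame. It uses the standard-library `sqlite3` module as a stand-in for MySQL so that it runs without a server; the `person` table and its column names are hypothetical. A MySQL driver would be used in much the same way, though the exact container type of the result depends on the driver.

```python
import sqlite3

import pandas as pd

# sqlite3 stands in for MySQL here; the table and columns are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, sex TEXT, age TEXT, birthday TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?, ?)",
    [("小孙", "男", "12", "1991-03-13"), ("小明", "男", "12", "1991-03-13")],
)

cursor = conn.execute("SELECT name, sex, age, birthday FROM person")
rows = cursor.fetchall()  # one tuple per row
# Column names are taken from the cursor metadata.
columns = [desc[0] for desc in cursor.description]
conn.close()

df_db = pd.DataFrame(rows, columns=columns)
print(df_db)
```

If a driver hands back a tuple of tuples rather than a list, wrapping it as `list(rows)` before passing it to `pd.DataFrame` is a harmless precaution.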

Creating from Series

series = {'水果': pd.Series(['苹果', '梨', '草莓']),
          '数量': pd.Series([60, 50, 100]),
          '价格': pd.Series([7, 5, 18])}

pd.DataFrame(series)
   水果   数量  价格
0  苹果   60   7
1   梨    50   5
2  草莓  100  18
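Because each column is a Series, pandas lines the columns up by index label rather than by position. The Series above all carry the default 0, 1, 2 index, so they match trivially. A sketch with explicit, hypothetical labels (`a`, `b`, `c`) shows what happens when one Series is shorter:

```python
import pandas as pd

series = {
    "水果": pd.Series(["苹果", "梨", "草莓"], index=["a", "b", "c"]),
    "数量": pd.Series([60, 50], index=["a", "b"]),  # no entry for "c"
}

df_s = pd.DataFrame(series)
print(df_s)  # the missing 数量 value for row "c" shows up as NaN
```

This differs from the dictionary-of-lists case, where the lists are expected to have the same length.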

References

https://mp.weixin.qq.com/s?src=11&timestamp=1630639469&ver=3291&signature=2hP6UP*xyIiph4dVB7QKEtEbmKdsacG8sFuoIeSYBuRFZ*tDDJPxkb21KefUeBiw7chpJcCW-FnbOtMcfvdy*QpOpjHZzjK0yZFnTKiCPvpn4Cy3H2imaKiJna0nM2J3&new=1