DataFrame: adding a column (the new values are matched to rows by index)
import pandas as pd

index = ["a", "b", "c", "d", "e"]
data1 = [10, 5, 8, 12, 3]
data2 = [30, 25, 12, 10, 8]
series1 = pd.Series(data1, index=index)
series2 = pd.Series(data2, index=index)

# Values for the new column: the index says 15 goes in row 0 and 7 in row 1
new_column = pd.Series([15, 7], index=[0, 1])

# Build a DataFrame from series1 and series2 (each Series becomes one row)
df = pd.DataFrame([series1, series2])

# Add the data in new_column to df as a new column "f"
df["f"] = new_column

# Output
print(df)
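The column assignment above relies on index alignment: each value lands in the row whose label matches its index. A minimal sketch of what happens when a label is missing, using a small made-up frame (the names `partial` and the sample values are illustrative only):

```python
import pandas as pd

# Two rows with the default labels 0 and 1.
df = pd.DataFrame([[10, 5], [30, 25]], columns=["a", "b"])

# This Series only carries a value for row label 1.
partial = pd.Series([7], index=[1])

# Row 0 has no matching label in partial, so its cell becomes NaN.
df["f"] = partial

print(df)
```

Because of the NaN, the new column is stored as floating point rather than integer.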
Pandas
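DataFrame: adding a row (the table grows downward)
import pandas as pd

data = {"fruits": ["a", "b", "c", "d", "e"],
        "year": [2001, 2002, 2001, 2008, 2006],
        "time": [1, 4, 5, 6, 3]}
df = pd.DataFrame(data)

# index=[...] states which column each value belongs to; without it the
# values get the labels 0, 1, 2 and end up in extra columns filled with NaN.
series = pd.Series(["f", 2008, 7], index=["fruits", "year", "time"])

# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
df = pd.concat([df, pd.DataFrame([series])], ignore_index=True)
print(df)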
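Output
  fruits  time  year
0      a     1  2001
1      b     4  2002
2      c     5  2001
3      d     6  2008
4      e     3  2006
5      f     7  2008
The columns appear alphabetically here, as older pandas versions printed them; recent versions keep the dictionary's insertion order (fruits, year, time).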
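As a hedged alternative for appending a single row, assigning to a new label with `.loc` also works when the frame has the default 0..n-1 index. This is a sketch on a shortened copy of the data, not the method used in the original notes:

```python
import pandas as pd

df = pd.DataFrame({"fruits": ["a", "b"],
                   "year": [2001, 2002],
                   "time": [1, 4]})

# With a default 0..n-1 index, len(df) is the next unused row label.
# The list must give one value per column, in column order.
df.loc[len(df)] = ["f", 2008, 7]

print(df)
```

For adding many rows at once, building a second DataFrame and calling `pd.concat` once is usually the better choice, since each `.loc` enlargement copies data.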
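DataFrame: rows are labeled by the index, columns by the column names
The row labels of a DataFrame are called the index. They run down the left side and, unless specified, are numbered automatically from 0 to n-1.
The column labels are called columns. They run across the top, one name per column.
A DataFrame is a two-dimensional data structure that bundles multiple Series together.
Passing a list of Series to pandas.DataFrame() creates a DataFrame in which each Series becomes one row, the Series index labels become the column names, and the rows are numbered in ascending order starting from 0.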
pandas.DataFrame([Series, Series, ...])
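To make the row/column terminology concrete, here is a small sketch that builds a frame from two Series and inspects both sets of labels. The names `s1` and `s2` are placeholders for illustration:

```python
import pandas as pd

s1 = pd.Series([10, 5], index=["a", "b"])
s2 = pd.Series([30, 25], index=["a", "b"])

# Each Series becomes one row; the Series index becomes the column names.
df = pd.DataFrame([s1, s2])

print(list(df.index))    # row labels, numbered automatically
print(list(df.columns))  # column labels, taken from the Series index
```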
A DataFrame can also be created from a dictionary whose values are lists.
The lists must all have the same length.
Code
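A short sketch of why the equal-length rule matters: pandas rejects a dictionary whose lists differ in length by raising a `ValueError`. The dictionary `bad` and the flag `raised` are made up for this illustration:

```python
import pandas as pd

# The two lists have lengths 2 and 1, so no rectangular table can be built.
bad = {"fruits": ["apple", "orange"], "year": [2001]}

try:
    pd.DataFrame(bad)
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)
```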
import pandas as pd

data = {"fruits": ["apple", "orange", "banana", "strawberry", "kiwifruit"],
        "year": [2001, 2002, 2001, 2008, 2006],
        "time": [1, 4, 5, 6, 3]}
df = pd.DataFrame(data)
print(df)
Output
       fruits  time  year
0       apple     1  2001
1      orange     4  2002
2      banana     5  2001
3  strawberry     6  2008
4   kiwifruit     3  2006

Example
import pandas as pd

index = ["a", "b", "c", "d", "e"]
data1 = [10, 5, 8, 12, 3]
data2 = [30, 25, 12, 10, 8]
series1 = pd.Series(data1, index=index)
series2 = pd.Series(data2, index=index)

# Build a DataFrame from series1 and series2 and assign it to df
df = pd.DataFrame([series1, series2])

# Output
print(df)
Sorting a Series
import pandas as pd

index = ["a", "b", "c", "d", "e"]
data = [10, 5, 8, 12, 3]
series = pd.Series(data, index=index)

# Sort series alphabetically by index and assign the result to items1
items1 = series.sort_index()

# Sort series by value in ascending order and assign the result to items2
items2 = series.sort_values()

print(items1)
print()
print(items2)
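Both sort methods default to ascending order. As a small sketch, passing `ascending=False` reverses the order; the name `largest_first` is only illustrative:

```python
import pandas as pd

series = pd.Series([10, 5, 8, 12, 3], index=["a", "b", "c", "d", "e"])

# ascending=False sorts from largest to smallest.
# sort_values returns a new Series; the original keeps its order.
largest_first = series.sort_values(ascending=False)

print(largest_first)
```

The same keyword works with `sort_index` to sort the labels in reverse alphabetical order.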
Series: selecting values within a range (greater than or equal to, less than)
import pandas as pd

index = ["a", "b", "c", "d", "e"]
data = [10, 5, 8, 12, 3]
series = pd.Series(data, index=index)

# Keep only the elements whose value is at least 5 and less than 10,
# and assign the result back to series. Combining both conditions with &
# in a single mask avoids chaining two separate [] lookups.
series = series[(series >= 5) & (series < 10)]

print(series)
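The same half-open range can also be written with `Series.between`. This is a sketch, not the approach from the original notes, and the string form of `inclusive` assumes pandas 1.3 or later:

```python
import pandas as pd

series = pd.Series([10, 5, 8, 12, 3], index=["a", "b", "c", "d", "e"])

# inclusive="left" keeps values with 5 <= value < 10
# (string values for inclusive assume pandas >= 1.3).
selected = series[series.between(5, 10, inclusive="left")]

print(selected)
```

Here 10 is excluded because the right edge is open, so only the elements labeled "b" and "c" remain.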
Deleting an element from a Series
import pandas as pd

index = ["a", "b", "c", "d", "e"]
data = [10, 5, 8, 12, 3]

# Create a Series from index and data and assign it to series
series = pd.Series(data, index=index)

# Drop the element whose index label is "c" and assign the result back to series
series = series.drop("c")

print(series)

Result
a    10
b     5
d    12
e     3
dtype: int64
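`drop` also accepts a list of labels. The sketch below removes two elements at once and shows that the original Series is left untouched unless the result is assigned back; the name `trimmed` is illustrative:

```python
import pandas as pd

series = pd.Series([10, 5, 8, 12, 3], index=["a", "b", "c", "d", "e"])

# drop() returns a new Series; a list removes several labels in one call.
trimmed = series.drop(["a", "e"])

print(trimmed)
print(len(series))  # the original still holds all five elements
```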