Plotting with Seaborn

Histograms

Seaborn offers three ways to draw a histogram: a figure-level function, an axes-level function, and the objects interface.

We start with the axes-level function, histplot(). Run the following code:

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
penguins = sns.load_dataset('penguins')
sns.histplot(data=penguins,x='flipper_length_mm',ax=ax)
plt.show()

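The output is shown in the figure below.

The main setting for a histogram is either the number of bins or the bin width, controlled by the parameters bins and binwidth respectively. Run the following code: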

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme(style='ticks')
fig, ax = plt.subplots(figsize=(6,6))
penguins = sns.load_dataset('penguins')
sns.histplot(data=penguins,x='flipper_length_mm',bins=20,ax=ax)
plt.show()

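The output is shown in the figure below.

This sets the number of bins directly. We can instead specify the bin width, so that every bin spans the same fixed interval. Run the following code: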

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
penguins = sns.load_dataset('penguins')
sns.histplot(data=penguins,x='flipper_length_mm',color='aquamarine',binwidth=8,ax=ax)
plt.show()

The output is shown in the figure below.
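To make the link between a bin width and the resulting bins concrete, here is a minimal sketch in plain Python. The helper `equal_width_edges` and the sample values are hypothetical and only illustrate equal-width binning; seaborn's internal edge computation may differ in detail.

```python
import math

def equal_width_edges(values, binwidth):
    """Return bin edges of a fixed width that cover all values."""
    lo, hi = min(values), max(values)
    # Number of bins needed so that the last edge reaches at least hi.
    n_bins = max(1, math.ceil((hi - lo) / binwidth))
    return [lo + i * binwidth for i in range(n_bins + 1)]

# Made-up flipper lengths, not taken from the penguins dataset.
flipper_lengths = [172, 181, 190, 195, 203, 210, 222, 231]
edges = equal_width_edges(flipper_lengths, binwidth=8)
print(edges)
```

With a fixed width the number of bins follows from the data range, whereas bins=20 fixes the count and lets the width follow.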

The binrange parameter restricts the range of values that is binned; pass a list or tuple giving the lower and upper limits. Run the following code:

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
penguins = sns.load_dataset('penguins')
sns.histplot(data=penguins,x='flipper_length_mm',color='mediumseagreen',
             bins=15,binrange=(200,240),ax=ax)
plt.show()

The output is shown in the figure below.

Do we need a special parameter to draw a horizontal histogram? No: simply assign the variable to the y axis. Run the following code:

import seaborn as sns
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6,6))
penguins = sns.load_dataset('penguins')
sns.histplot(data=penguins, y='flipper_length_mm',color='thistle',ax=ax)
plt.show()

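The output is shown in the figure below.

A kernel density estimate (KDE) curve can be overlaid on the histogram by setting the kde parameter. Run the following code: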

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
penguins = sns.load_dataset('penguins')
sns.histplot(data=penguins, x='flipper_length_mm', kde=True, 
             color='darkseagreen', ax=ax)
plt.show()

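The output is shown in the figure below.

To adjust the bandwidth used when estimating the density, pass options through the kde_kws parameter; see the kernel density function kdeplot() below for details.

To draw several histograms in one figure, simply pass the whole dataset; by default each numeric column is treated as a variable and gets its own histogram. Run the following code: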

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
iris = sns.load_dataset('iris')
sns.histplot(data=iris, palette='Set2', ax=ax)
plt.show()

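The output is shown in the figure below.

All of the histograms so far use the full data. What if we want a separate histogram for each level of a grouping variable? Use the hue parameter as usual. Run the following code: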

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
iris = sns.load_dataset('iris')
sns.histplot(data=iris, x='sepal_width', palette='twilight', hue='species', ax=ax)
plt.show()

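The output is shown in the figure below.

Here the group histograms are overlaid on one another, which is the default (multiple='layer'). Can they be stacked instead? Yes, through the multiple parameter. Run the following code: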

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
iris = sns.load_dataset('iris')
sns.histplot(data=iris, x='sepal_width', palette='turbo', hue='species', 
             bins=15, multiple='stack',ax=ax)
plt.show()

The output is shown in the figure below.
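Stacking means that each group's bar starts where the previous group's bar ended. The sketch below shows that bookkeeping in plain Python; the helper `stack_bars`, the group names, and the counts are all made up for illustration, and the order in which seaborn stacks the groups is not something this sketch tries to reproduce.

```python
def stack_bars(group_counts):
    """Return each group's bar baselines and the total height per bin."""
    n_bins = len(next(iter(group_counts.values())))
    running = [0] * n_bins
    bottoms = {}
    for group, counts in group_counts.items():
        # Bars of this group start where the stack currently ends.
        bottoms[group] = list(running)
        running = [r + c for r, c in zip(running, counts)]
    return bottoms, running

# Made-up counts for three groups over four bins.
counts = {"a": [1, 4, 3, 0], "b": [2, 3, 1, 1], "c": [0, 2, 4, 2]}
bottoms, totals = stack_bars(counts)
print(bottoms["c"])
print(totals)
```

The top of the stack in each bin equals the count of the ungrouped histogram for that bin.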

The histograms above are all drawn as bars, much like a bar chart. Can this be changed? Yes, through the element parameter. Run the following code:

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
iris = sns.load_dataset('iris')
sns.histplot(data=iris, x='sepal_width', palette='rainbow', hue='species', 
             bins=15, element='step',ax=ax)
plt.show()

The output is shown in the figure below.

All the histograms above show counts. Can we plot frequencies instead? The stat parameter controls this. Run the following code:

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
tips = sns.load_dataset('tips')
sns.histplot(data=tips, x='total_bill', palette='ocean', hue='sex', 
             bins=15, stat='frequency',ax=ax)
plt.show()

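The output is shown in the figure below.

Note that 'frequency' here is not the relative frequency commonly used in statistics: it is the count divided by the bin width. What if we want the relative frequency divided by the bin width? Is that stat='probability'? Run the following code: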

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
tips = sns.load_dataset('tips')
sns.histplot(data=tips, x='total_bill', palette='YlGnBu', hue='sex', 
             binwidth=2, stat='probability',ax=ax)
plt.show()

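The output is shown in the figure below.

In fact, stat='probability' does not plot the relative frequency divided by the bin width; it plots the relative frequency itself. The documentation describes it as normalizing so that the bar heights sum to 1, which is exactly the relative frequency.

So how do we plot the relative frequency divided by the bin width? In other words, we want the total area of the histogram to equal 1, with the area of each bar giving the relative frequency of its bin, which is the classical definition of a histogram. Simply set stat to 'density'. Run the following code: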

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
tips = sns.load_dataset('tips')
sns.histplot(data=tips, x='total_bill', palette='YlOrBr', hue='sex', 
             binwidth=2.5, stat='density',ax=ax)
plt.show()

The output is shown in the figure below.

This is the histogram used as a genuine estimate of the probability density.
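The four statistics discussed above differ only in how the raw bin counts are rescaled. The following sketch spells out that arithmetic for equal-width bins; the function `normalize` and the counts are hypothetical and are not seaborn code.

```python
def normalize(counts, binwidth, stat):
    """Convert raw bin counts to the requested histogram statistic."""
    n = sum(counts)
    if stat == "count":
        return list(counts)
    if stat == "frequency":      # count per unit of x
        return [c / binwidth for c in counts]
    if stat == "probability":    # bar heights sum to 1
        return [c / n for c in counts]
    if stat == "density":        # bar areas sum to 1
        return [c / (n * binwidth) for c in counts]
    raise ValueError(f"unknown stat: {stat}")

counts = [2, 5, 8, 4, 1]   # made-up counts for five bins
binwidth = 2.5

probability = normalize(counts, binwidth, "probability")
density = normalize(counts, binwidth, "density")
print(sum(probability))                      # total bar height
print(sum(d * binwidth for d in density))    # total bar area
```

Up to floating-point rounding, both printed totals equal 1: the first sums heights, the second sums areas.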

To draw a histogram of discrete data (which is really a bar chart), set the discrete parameter to True. Run the following code:

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
tips = sns.load_dataset('tips')
sns.histplot(data=tips, x='day', color='palegreen', discrete=True, shrink=0.6, ax=ax)
plt.show()

The output is shown in the figure below.

To show the histograms of two groups side by side, set the multiple parameter to 'dodge'. Run the following code:

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
tips = sns.load_dataset('tips')
sns.histplot(data=tips, x='day', hue='sex', multiple='dodge', shrink=.8, ax=ax)
plt.show()

The output is shown in the figure below.

Sometimes we need a histogram on a logarithmic scale. This is also easy: set log_scale to True, which uses base 10 by default. Run the following code:

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
planets = sns.load_dataset('planets')
sns.histplot(data=planets, x='distance', log_scale=True, ax=ax, color='peachpuff')
plt.show()

The output is shown in the figure below.
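On a base-10 log scale the bins have equal width in log10(x), so in the original units each bin is a constant multiple of the previous one. A minimal sketch of such edges, assuming strictly positive data; the helper `log10_edges` and the values are hypothetical.

```python
import math

def log10_edges(values, n_bins):
    """Bin edges equally spaced in log10(x), returned in original units."""
    lo, hi = math.log10(min(values)), math.log10(max(values))
    step = (hi - lo) / n_bins
    return [10 ** (lo + i * step) for i in range(n_bins + 1)]

# Made-up distances spanning three orders of magnitude.
distances = [1, 10, 100, 1000]
edges = log10_edges(distances, n_bins=3)
print(edges)
```

Each edge here is about ten times the previous one, which is why the bars look equally wide on the logarithmic axis.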

If you do not want the bars filled, set fill to False. Run the following code:

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
planets = sns.load_dataset('planets')
sns.histplot(data=planets, x='distance', log_scale=True, fill=False,ax=ax, color='black')
plt.show()

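The output is shown in the figure below.

We can also draw a cumulative histogram. With stat set to 'probability', is the cumulative histogram not just the empirical cumulative distribution function? Run the following code: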

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
planets = sns.load_dataset('planets')
sns.histplot(data=planets, x='distance', hue='method',
             hue_order=['Radial Velocity', 'Transit'], element='step', 
             fill=False, cumulative=True,stat='probability',log_scale=True, ax=ax)
plt.show()

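The output is shown in the figure below.

The cumulative histogram here is clearly not the empirical cumulative distribution function, because neither curve ends at 1. Why? First, the two are different concepts: a cumulative histogram is the running sum of binned counts, whereas the empirical distribution function is built from the ranks of the individual observations.

Then why do the curves not reach 1 with stat='probability'? Because the data are grouped by hue and, by default (common_norm=True), the normalization is shared across groups, so each curve ends at its group's share of the observations rather than at 1. The grouping variable also has more than two levels; the hue_order parameter only selects the two that are drawn. Passing common_norm=False normalizes each group separately.

In practice, the empirical distribution has its own dedicated function, ecdfplot().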
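The difference between the two constructions, and the effect of a shared normalization, can be sketched in plain Python. All names and data below are made up for illustration; this is not how seaborn implements either plot.

```python
def ecdf(sample, x):
    """Proportion of observations less than or equal to x."""
    return sum(1 for v in sample if v <= x) / len(sample)

def bin_counts(sample, edges):
    """Count observations per equal-width bin; the last bin is closed on the right."""
    width = edges[1] - edges[0]
    counts = [0] * (len(edges) - 1)
    for v in sample:
        i = min(int((v - edges[0]) // width), len(counts) - 1)
        counts[i] += 1
    return counts

def cumulative_probability(counts, total):
    """Running sum of bin counts divided by a chosen total."""
    running, out = 0, []
    for c in counts:
        running += c
        out.append(running / total)
    return out

group_a = [1, 2, 2, 3, 5]          # made-up data
group_b = [2, 4, 4, 6, 7, 8, 9]
edges = [0, 2, 4, 6, 8, 10]

counts_a = bin_counts(group_a, edges)
n_all = len(group_a) + len(group_b)

shared = cumulative_probability(counts_a, n_all)           # shared normalization
separate = cumulative_probability(counts_a, len(group_a))  # per-group normalization
print(shared[-1], separate[-1], ecdf(group_a, 2))
```

With the shared total, the curve for group_a stops at 5/12, its share of all observations; normalized on its own it ends at 1. The ECDF, by contrast, needs no bins and steps at every observation.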

What about a joint histogram of two variables? Just specify both x and y. Run the following code:

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
planets = sns.load_dataset('planets')
sns.histplot(data=planets, x='distance', y='mass', stat='frequency',
             color='violet', cbar=True, ax=ax)
plt.show()

The output is shown in the figure below.

Setting cbar to True adds a colorbar.

For more on histplot(), see:

https://seaborn.pydata.org/generated/seaborn.histplot.html

Next, the figure-level histogram function, displot(). Its basic usage matches histplot(); just pass kind='hist', which is also the default. Run the following code:

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
penguins = sns.load_dataset('penguins')
fig = sns.displot(data=penguins, kind='hist', x='flipper_length_mm', y='bill_length_mm',
                  color='#808080', cbar=True, height=6, aspect=1)
plt.show()

The output is shown in the figure below.

For more on the figure-level function displot(), see:

https://seaborn.pydata.org/generated/seaborn.displot.html

The third way to draw a histogram is the objects interface. Run the following code:

import seaborn as sns
import seaborn.objects as so

sns.set_theme()  # set the seaborn plotting style
fmri = sns.load_dataset('fmri')
(
    so.Plot(data=fmri, x='signal', color='region')
    .add(so.Bars(), so.Hist(stat='density', binwidth=0.05), so.Dodge())
    .layout(size=(6, 6))
    .show()
)

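The output is shown in the figure below.

Of the three approaches, the objects interface is the most verbose, and it is rarely needed unless you have special requirements; the first two cover most needs.

Kernel density estimate plots

We start with the axes-level function kdeplot(). Its main parameter is bw_adjust, which scales the bandwidth; the kernel is Gaussian. Most of the other parameters concern grouping variables.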
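To show what the bandwidth does, here is a hand-written Gaussian kernel density estimate in plain Python. It assumes a Scott's-rule base bandwidth (sample standard deviation times n^(-1/5)) multiplied by a `bw_adjust` factor; the function `gaussian_kde` and the sample are hypothetical, and the sketch is not seaborn's implementation.

```python
import math

def gaussian_kde(sample, bw_adjust=1.0):
    """Return a function that evaluates a Gaussian KDE of the sample."""
    n = len(sample)
    mean = sum(sample) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in sample) / (n - 1))
    # Assumed base bandwidth: Scott's rule, then scaled by bw_adjust.
    h = bw_adjust * std * n ** (-1 / 5)

    def density(x):
        # Average of Gaussian bumps of width h centred on each observation.
        total = sum(math.exp(-0.5 * ((x - v) / h) ** 2) for v in sample)
        return total / (n * h * math.sqrt(2 * math.pi))

    return density

bills = [10.3, 12.5, 13.4, 16.0, 17.9, 20.7, 24.6, 31.3]  # made-up values
smooth = gaussian_kde(bills, bw_adjust=2.0)   # wider kernels, flatter curve
rough = gaussian_kde(bills, bw_adjust=0.5)    # narrower kernels, bumpier curve
print(smooth(15.0), rough(15.0))
```

A larger factor spreads each observation's contribution over a wider interval, so the curve becomes flatter while still integrating to 1.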

To plot the kernel density estimate of a single variable, run the following code:

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
tips = sns.load_dataset('tips')
sns.kdeplot(data=tips, x='total_bill', color='#c79fef', ax=ax)
plt.show()

The output is shown in the figure below.

Given a dataset, we can plot the kernel density estimates of several columns at once. Run the following code:

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
iris = sns.load_dataset('iris')
sns.kdeplot(data=iris, palette='cool', ax=ax)
plt.show()

The output is shown in the figure below.

Given a grouping variable, we can also plot one kernel density curve per group. Run the following code:

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
iris = sns.load_dataset('iris')
sns.kdeplot(data=iris, x='petal_length', palette='cool', hue='species', ax=ax)
plt.show()

The output is shown in the figure below.

Use the fill parameter to shade the area under each curve. Run the following code:

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
tips = sns.load_dataset('tips')
sns.kdeplot(data=tips, x='total_bill', hue='size', fill=True, common_norm=False, 
            palette='PRGn', alpha=0.5, linewidth=0, ax=ax)
plt.show()

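The output is shown in the figure below.

Setting common_norm=False makes the area under each group's density curve equal 1, rather than making the areas under all the curves sum to 1.

Next, a two-dimensional kernel density estimate. Run the following code: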

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
iris = sns.load_dataset('iris')
sns.kdeplot(data=iris, x='petal_length', y='petal_width', bw_adjust=0.5, 
            color='yellowgreen',  alpha=0.6, ax=ax)
plt.show()

The output is shown in the figure below.

For more parameters of kdeplot(), see:

https://seaborn.pydata.org/generated/seaborn.kdeplot.html

The figure-level function for kernel density estimates is again displot(). Run the following code:

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
geyser = sns.load_dataset('geyser')
fig = sns.displot(data=geyser, kind='kde', x='waiting', y='duration', hue='kind', 
                  fill=True, palette='viridis', height=6, aspect=1)
plt.show()

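The output is shown in the figure below.

Older versions of the objects interface have no counterpart for kernel density estimates; recent seaborn releases provide the so.KDE stat.

Empirical distribution function

Use ecdfplot() to draw the empirical distribution function.

First, the empirical distribution function of a single variable. Run the following code: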

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
tips = sns.load_dataset('tips')
sns.ecdfplot(data=tips, x='total_bill', color='tab:olive', alpha=0.4, ax=ax)
plt.show()

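The output is shown in the figure below.

To draw the empirical distribution functions of several populations in one figure, use the grouping variable hue. Run the following code: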

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
iris = sns.load_dataset('iris')
sns.ecdfplot(data=iris, x='petal_length', hue='species', 
             palette='inferno', alpha=0.7, ax=ax)
plt.show()

The output is shown in the figure below.

For more parameters of ecdfplot(), see:

https://seaborn.pydata.org/generated/seaborn.ecdfplot.html

Now the figure-level function. Run the following code:

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
geyser = sns.load_dataset('geyser')
fig = sns.displot(data=geyser, kind='ecdf', x='waiting', hue='kind', 
                  palette='cividis', height=6, aspect=1)
plt.show()

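The output is shown in the figure below.

Rug plots

A rug plot draws many short tick marks along an axis; the denser the ticks, the denser the data in that region. It is drawn with rugplot().

To draw the rug plot of a single variable, run the following code: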

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
tips = sns.load_dataset('tips')
sns.rugplot(data=tips, x='tip', ax=ax)
plt.show()

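The output is shown in the figure below.

In practice a rug plot is rarely drawn alone; it usually accompanies a scatter plot, a histogram, or a kernel density curve.

For a scatter plot with rug plots, run the following code: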

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
tips = sns.load_dataset('tips')
sns.scatterplot(data=tips, x='tip', y='total_bill', color='#DDA0DD', alpha=0.6, ax=ax)
sns.rugplot(data=tips, x='tip', ax=ax)
sns.rugplot(data=tips, y='total_bill', ax=ax)
plt.show()

The output is shown in the figure below.

Next, a kernel density curve combined with a rug plot. Run the following code:

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
fig, ax = plt.subplots(figsize=(6,6))
tips = sns.load_dataset('tips')
sns.kdeplot(data=tips, x='tip', color='#DFA099', alpha=0.6, ax=ax)
sns.rugplot(data=tips, x='tip', height=-0.01, clip_on=False, ax=ax)
plt.show()

The output is shown in the figure below.

For more parameters of rugplot(), see:

https://seaborn.pydata.org/generated/seaborn.rugplot.html

The figure-level function is again displot(); just pass rug=True. The rug plot is drawn as follows:

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()
penguins = sns.load_dataset('penguins')
fig = sns.displot(penguins, kind='kde', x='bill_length_mm', y='bill_depth_mm',
                  rug=True, color='#008080', alpha=0.7, height=6, aspect=1)
plt.show()

The output is shown in the figure below.

The objects interface has no equivalent: there is no ready-made mark for drawing a rug plot.
