Basic Visualization

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.read_csv("train.csv")

Categorical Plots

Factorplot

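  • seaborn.factorplot(x, y, hue, data, row, col…, aspect, size,…)
  • x, y: column names
  • hue (optional): name of the column used for color encoding
  • data: DataFrame
  • aspect (optional): float, width-to-height ratio of each facet
  • Note: seaborn 0.9 renamed factorplot to catplot and the size parameter to height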
sns.factorplot(x="Pclass", y="Survived", hue="Sex", data=df, aspect=0.9, size=3.5)

sns.factorplot(x="Pclass", y="Survived", data=df, aspect=0.9, size=3.5)

sns.factorplot(x="Embarked", y="Survived", hue="Sex", data=df)
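
On seaborn 0.9 or later, where factorplot is named catplot and size is named height, the first call above can be sketched roughly as follows. The toy DataFrame is made-up stand-in data (not rows from train.csv), and the Agg backend is only an assumption so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import pandas as pd
import seaborn as sns

# Made-up stand-in rows (hypothetical, not taken from train.csv)
toy = pd.DataFrame({
    "Pclass":   [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
    "Sex":      ["male", "male", "female", "female"] * 3,
    "Survived": [0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1],
})

# kind="point" matches factorplot's default point plot (catplot defaults to a strip plot);
# height takes the place of the old size argument
grid = sns.catplot(x="Pclass", y="Survived", hue="Sex", data=toy,
                   kind="point", aspect=0.9, height=3.5)

# With no row/col facets the grid holds a single Axes
ax = grid.axes.flat[0]
print(ax.get_xlabel(), ax.get_ylabel())
```

Each point is the mean of Survived within a Pclass/Sex group, so the plot reads as a survival rate per group.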

Countplot

  • Shows how many rows fall into each category value (the number of occurrences of each value)
  • seaborn.countplot(x="column_name", data=dataframe)
ax = sns.countplot(x="Sex", hue="Survived", palette="Set1", data=df)
ax.set(title="Survivors according to sex", xlabel="Sex", ylabel="Total")
plt.show()

sns.countplot(x="Pclass", data=df, palette="Set2")
plt.title("Number of passengers by Pclass")
plt.show()

sns.countplot(x="Pclass", hue="Survived", data=df)
plt.title("Number of passengers by Pclass and survival")
plt.show()
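
Since countplot draws one bar per category with a height equal to the number of rows, the bar heights can be cross-checked with plain pandas. A small sketch, assuming a made-up toy frame (hypothetical values, not from train.csv):

```python
import pandas as pd

# Made-up stand-in rows (hypothetical, not taken from train.csv)
toy = pd.DataFrame({
    "Pclass":   [1, 1, 2, 3, 3, 3],
    "Survived": [1, 0, 1, 0, 0, 1],
})

# Row counts per category: the bar heights of countplot(x="Pclass")
per_class = toy["Pclass"].value_counts().sort_index()

# Row counts per category and hue level: the bar heights of
# countplot(x="Pclass", hue="Survived")
per_class_and_outcome = pd.crosstab(toy["Pclass"], toy["Survived"])

print(per_class)
print(per_class_and_outcome)
```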

df.head()
   PassengerId  Survived  Pclass  Name                                               Sex     Age   SibSp  Parch  Ticket            Fare     Cabin  Embarked
0  1            0         3       Braund, Mr. Owen Harris                            male    22.0  1      0      A/5 21171         7.2500   NaN    S
1  2            1         1       Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0  1      0      PC 17599          71.2833  C85    C
2  3            1         3       Heikkinen, Miss. Laina                             female  26.0  0      0      STON/O2. 3101282  7.9250   NaN    S
3  4            1         1       Futrelle, Mrs. Jacques Heath (Lily May Peel)       female  35.0  1      0      113803            53.1000  C123   S
4  5            0         3       Allen, Mr. William Henry                           male    35.0  0      0      373450            8.0500   NaN    S

FacetGrid

  • Create a FacetGrid object by passing the DataFrame together with row, col, hue, etc.
  • Pass the plotting function and the column(s) to plot to the object's map method
  • Makes outliers easy to spot
graph = sns.FacetGrid(df, col="Survived")
graph.map(plt.hist, "Fare", bins=20)  # map() hands the grid object the plot type to draw in each subplot

graph = sns.FacetGrid(df, col="Sex")
graph.map(plt.hist, "Fare", bins=20, color="r")

graph = sns.FacetGrid(df, col="Sex", row="Survived")
graph = graph.map(plt.hist, "Fare", bins=20, color="y")

graph = sns.FacetGrid(df, col="Sex", hue="Survived", size=4)  # seaborn >= 0.9: use height=4 instead of size=4
graph = graph.map(plt.hist, "Fare", bins=20)

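  • Besides histograms, other plots such as regplot can be mapped onto the grid, as shown below
  • Add a legend to show which value each color represents
graph = sns.FacetGrid(df, col="Sex", hue="Survived", size=4)  # seaborn >= 0.9: use height=4 instead of size=4
graph = graph.map(sns.regplot, "Fare", "Age", fit_reg=False)  # fit_reg=False draws the scatter points only
graph = graph.add_legend()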

  • Set the x-axis and y-axis ranges
graph = sns.FacetGrid(df, col="Sex", hue="Survived", size=4)  # seaborn >= 0.9: use height=4 instead of size=4
graph = graph.map(sns.regplot, "Fare", "Age", fit_reg=False)
graph = graph.add_legend()
graph.set(xlim=(1, 300), ylim=(0, 100))