Building an Optimized Model with AutoGluon


sarah0518 2021. 4. 23. 22:04


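In the previous post I covered how to fix AutoGluon installation issues,

so today I'd like to write about what AutoGluon is and how to use it!

Put simply, AutoGluon trains a variety of algorithms within a customizable hyperparameter range

and selects the best-performing one for you.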

When I ran it, AutoGluon used the following algorithms:

NeuralNetFastAI
KNeighborsDist
WeightedEnsemble_L2
CatBoost
LightGBM
ExtraTreesEntr
ExtraTreesGini
KNeighborsUnif
RandomForestGini
LightGBMLarge....

and so on. It tried all of these automatically.

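You can also set the tuning parameters for each algorithm.

As far as I know, hyperparameters can be customized through model keys such as GBM (LightGBM), NN (neural network), RF (random forest), and CAT (CatBoost).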
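
To make the customization above concrete, here is a minimal sketch of passing a `hyperparameters` dictionary to `fit`. The specific values, the variable name `custom_hyperparameters`, and the helper `fit_with_custom_hyperparameters` are my own illustrations, not settings from this post. As I understand it, only the model families listed as keys are trained.

```python
# Illustrative values only: not tuned, and not taken from the original post.
custom_hyperparameters = {
    "GBM": {"num_boost_round": 200, "learning_rate": 0.05},  # LightGBM
    "CAT": {"iterations": 300, "depth": 6},                  # CatBoost
    "RF": {"n_estimators": 300},                             # random forest
}


def fit_with_custom_hyperparameters(train_df, label="target"):
    """Fit a TabularPredictor using only the model families listed above.

    Hypothetical helper for illustration; train_df is a pandas DataFrame
    that contains the label column.
    """
    # Imported lazily so the dictionary can be inspected without AutoGluon installed.
    from autogluon.tabular import TabularPredictor

    return TabularPredictor(label=label).fit(
        train_data=train_df,
        hyperparameters=custom_hyperparameters,
    )
```

Calling `fit_with_custom_hyperparameters(train10)` would then replace the plain `fit` call used later in this post.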

Now!!

Let's look at how to use it.

It's very simple.


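import pandas as pd
from sklearn.model_selection import train_test_split
from autogluon.tabular import TabularPredictor

# Load the data (the file is cp949-encoded)
df = pd.read_csv('df.csv', encoding='cp949')

# 70/30 train/test split, stratified on the label
train10, test10 = train_test_split(df, train_size=0.7, random_state=333, stratify=df['target'])

# autogluon: fit on the four features plus the label column
predictor = TabularPredictor(label='target').fit(train_data=train10[['val1', 'val2', 'val3', 'val4', 'target']])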


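Running the code above shows which models were fitted,

and the predictor object holds information such as the directories where the fitted models are saved.

To check the predictor's summary, a single line such as predictor.fit_summary() is all you need.

Next is the code that makes predictions on the test data.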


prediction=predictor.predict(test10[['val1', 'val2', 'val3', 'val4']])
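
If you also want class probabilities rather than just labels, a sketch like the one below may help. `predict_proba` is a `TabularPredictor` method, but the wrapper name `predict_labels_and_probabilities` is my own, and the exact shape of the returned probabilities can differ between AutoGluon versions, so treat this as an assumption to verify.

```python
def predict_labels_and_probabilities(predictor, features):
    """Return predicted labels and class probabilities for the same rows.

    Hypothetical helper: predictor is a fitted TabularPredictor and
    features is a DataFrame holding only the feature columns.
    """
    labels = predictor.predict(features)
    probabilities = predictor.predict_proba(features)
    return labels, probabilities
```

For example, `predict_labels_and_probabilities(predictor, test10[['val1', 'val2', 'val3', 'val4']])` would return the same labels as the line above plus the probabilities behind them.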


To check the performance of the model fitted with AutoGluon, let's look at the confusion matrix:


import sklearn.metrics as metrics
# Rows are the true labels, columns are the predicted labels
cm = metrics.confusion_matrix(test10['target'], prediction)
print(cm)

We can see that 25 of the 32 cancer patients were classified correctly!
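
The "25 of 32" figure is the recall for the positive class: true positives divided by all actual positives. The sketch below shows that arithmetic on a small made-up label list (the toy data is hypothetical, not the data from this post), and then applies it to the 25 and 32 reported above. For a binary problem, scikit-learn lays the matrix out as [[TN, FP], [FN, TP]].

```python
from collections import Counter


def binary_confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FN, FP, and TN for a binary classification problem."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            counts["tp"] += 1
        elif truth == positive:
            counts["fn"] += 1
        elif pred == positive:
            counts["fp"] += 1
        else:
            counts["tn"] += 1
    return counts


def recall(counts):
    """Share of actual positives that were predicted as positive."""
    positives = counts["tp"] + counts["fn"]
    return counts["tp"] / positives if positives else 0.0


# Toy labels, made up purely for illustration.
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 0, 1, 0, 1, 0, 1, 0]
toy_counts = binary_confusion_counts(toy_true, toy_pred)
toy_recall = recall(toy_counts)

# The figure reported in this post: 25 of 32 patients identified.
reported_recall = 25 / 32
print(toy_counts, toy_recall, round(reported_recall, 3))
```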

So, to find out which model performed best:


predictor.get_model_best()


A single line of code like the one above is all it takes.
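
If you want to compare every fitted model instead of seeing only the best one, `TabularPredictor` also offers a `leaderboard` method. The sketch below is a thin wrapper around it; the name `compare_models` is my own, and the exact columns returned may vary by AutoGluon version.

```python
def compare_models(predictor, test_df=None):
    """Return AutoGluon's leaderboard of fitted models.

    Hypothetical helper: when test_df (features plus the label column)
    is given, the models are also scored on that held-out data.
    """
    if test_df is None:
        return predictor.leaderboard()
    return predictor.leaderboard(test_df)
```

With the variables from this post, `compare_models(predictor, test10[['val1', 'val2', 'val3', 'val4', 'target']])` would score each model on the test split.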

For more information on AutoGluon, see the site below.

auto.gluon.ai/stable/index.html

AutoGluon: AutoML for Text, Image, and Tabular Data — AutoGluon Documentation 0.1.0 documentation


Also, for AutoGluon library installation issues, see my previous post.
favoritethings.tistory.com/32
