python_pandas_9 - 8BitsCoding/RobotMentor GitHub Wiki

YouTube

import pandas as pd

student_list = [{'name': 'John', 'major': "Computer Science", 'sex': "male"},
                {'name': 'Nate', 'major': "Computer Science", 'sex': "male"},
                {'name': 'Edward', 'major': "Computer Science", 'sex': "male"},
                {'name': 'Zara', 'major': "Psychology", 'sex': "female"},
                {'name': 'John', 'major': "Computer Science", 'sex': "male"}]

df = pd.DataFrame(student_list, columns = ['name', 'major', 'sex'])

df.duplicated()

Shows where duplicates exist: each row that exactly repeats an earlier row is marked True.
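As a minimal sketch of what this call returns, the block below rebuilds the same student_list and inspects the result. The variable name `mask` is only illustrative and is not part of the original page.

```python
import pandas as pd

student_list = [{'name': 'John', 'major': "Computer Science", 'sex': "male"},
                {'name': 'Nate', 'major': "Computer Science", 'sex': "male"},
                {'name': 'Edward', 'major': "Computer Science", 'sex': "male"},
                {'name': 'Zara', 'major': "Psychology", 'sex': "female"},
                {'name': 'John', 'major': "Computer Science", 'sex': "male"}]
df = pd.DataFrame(student_list, columns=['name', 'major', 'sex'])

# Boolean Series aligned with df's index:
# True where a row repeats an earlier row in every column.
mask = df.duplicated()
print(mask.tolist())  # [False, False, False, False, True]

# The mask can also be used to look at the repeated rows themselves.
print(df[mask])
```

Only the second John row is flagged, because the default treats the first occurrence as the original.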

df.drop_duplicates()

Removes the duplicated rows, keeping the first occurrence of each.
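One detail that is easy to miss: by default drop_duplicates() returns a new DataFrame and leaves the original untouched. The sketch below, which assumes the same sample data and uses the illustrative name `deduped`, shows this.

```python
import pandas as pd

student_list = [{'name': 'John', 'major': "Computer Science", 'sex': "male"},
                {'name': 'Nate', 'major': "Computer Science", 'sex': "male"},
                {'name': 'Edward', 'major': "Computer Science", 'sex': "male"},
                {'name': 'Zara', 'major': "Psychology", 'sex': "female"},
                {'name': 'John', 'major': "Computer Science", 'sex': "male"}]
df = pd.DataFrame(student_list, columns=['name', 'major', 'sex'])

# The result is a new DataFrame; df itself still has all five rows.
deduped = df.drop_duplicates()
print(len(df), len(deduped))   # 5 4

# Surviving rows keep their original index labels.
print(deduped.index.tolist())  # [0, 1, 2, 3]
```

To keep working with the cleaned data, assign the result back, for example `df = df.drop_duplicates()`.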

df.duplicated(['name'])

Marks a row as True when its value in the name column repeats an earlier row, regardless of the other columns.
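In the sample data above the two John rows match in every column, so the column-restricted check gives the same answer as the full-row check. The sketch below uses a hypothetical variant of the data (the second John has a different major, which is not in the original page) to show where the two differ.

```python
import pandas as pd

# Hypothetical data: two students share a name but not a major.
students = pd.DataFrame([
    {'name': 'John', 'major': "Computer Science", 'sex': "male"},
    {'name': 'Zara', 'major': "Psychology", 'sex': "female"},
    {'name': 'John', 'major': "Psychology", 'sex': "male"},
], columns=['name', 'major', 'sex'])

# Comparing whole rows: no row repeats another in every column.
full_row = students.duplicated()
print(full_row.tolist())  # [False, False, False]

# Comparing only the name column: the second John is flagged.
by_name = students.duplicated(['name'])
print(by_name.tolist())   # [False, False, True]
```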

df.drop_duplicates(['name'], keep='first')  # or keep='last'

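Removes rows whose name value is duplicated. keep='first' retains the first occurrence of each name; keep='last' retains the last.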
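The effect of the keep argument is easiest to see from which index labels survive. The sketch below assumes the same sample data; the names `first`, `last`, and `neither` are illustrative. It also shows keep=False, a third option pandas accepts that drops every row involved in a duplicate.

```python
import pandas as pd

student_list = [{'name': 'John', 'major': "Computer Science", 'sex': "male"},
                {'name': 'Nate', 'major': "Computer Science", 'sex': "male"},
                {'name': 'Edward', 'major': "Computer Science", 'sex': "male"},
                {'name': 'Zara', 'major': "Psychology", 'sex': "female"},
                {'name': 'John', 'major': "Computer Science", 'sex': "male"}]
df = pd.DataFrame(student_list, columns=['name', 'major', 'sex'])

# keep='first': the John at index 0 stays, the one at index 4 is dropped.
first = df.drop_duplicates(['name'], keep='first')
print(first.index.tolist())    # [0, 1, 2, 3]

# keep='last': the John at index 4 stays, the one at index 0 is dropped.
last = df.drop_duplicates(['name'], keep='last')
print(last.index.tolist())     # [1, 2, 3, 4]

# keep=False: both John rows are dropped.
neither = df.drop_duplicates(['name'], keep=False)
print(neither.index.tolist())  # [1, 2, 3]
```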