Страницы

Поиск по вопросам

четверг, 19 декабря 2019 г.

объединение строк в Pandas

#python #pandas #dataframe #pivot


          Win_Status Reference_Name
Match                              
957116906     Winner           Agni
957116906     Winner       Poseidon
957116906     Winner         Olorun
957116906     Winner        Guan Yu
957116906     Winner           Nike
957116906      Loser         Awilix
957116906      Loser          Cupid
957116906      Loser          Hades
957116906      Loser           Zeus
957116906      Loser      Xing Tian
957116922     Winner        Nemesis
957116922     Winner           Hera
957116922     Winner         Apollo
957116922     Winner      Aphrodite
957116922     Winner       Hercules
957116922      Loser          Chaac
957116922      Loser           Zeus
957116922      Loser       Hun Batz
957116922      Loser           Rama
957116922      Loser        Bacchus
957116921     Winner           Hera
957116921     Winner        Nemesis
957116921     Winner    King Arthur
957116921     Winner           Ymir
957116921     Winner      Cernunnos
957116921      Loser         Olorun
957116921      Loser         Fenrir
957116921      Loser         Anubis
957116921      Loser          Horus
957116921      Loser       Achilles
957116911     Winner       Cabrakan
957116911     Winner          Nu Wa
957116911     Winner          Skadi
957116911     Winner        Izanami
957116911     Winner      Ratatoskr
957116911      Loser         Apollo
957116911      Loser        Arachne
957116911      Loser         Scylla
957116911      Loser          Horus
957116911      Loser         Merlin
957116913     Winner    Erlang Shen
957116913     Winner         Vulcan
957116913     Winner           Ymir
957116913     Winner        Chronos
957116913     Winner    King Arthur
957116913      Loser       Poseidon
957116913      Loser           Odin
957116913      Loser          Neith
957116913      Loser        Bacchus
957116913      Loser           Ares
957116928     Winner          Freya
957116928     Winner       Camazotz
957116928     Winner          Horus
957116928     Winner    Erlang Shen
957116928     Winner      Xing Tian
957116928      Loser          He Bo
957116928      Loser         Raijin
957116928      Loser       Cabrakan
957116928      Loser          Anhur
957116928      Loser       Hercules
957116906     Winner           Agni
957116906     Winner       Poseidon
957116906     Winner         Olorun
957116906     Winner        Guan Yu
957116906     Winner           Nike
957116906      Loser         Awilix
957116906      Loser          Cupid
957116906      Loser          Hades
957116906      Loser           Zeus
957116906      Loser      Xing Tian
957116916     Winner        Izanami
957116916     Winner        Arachne
957116916     Winner      Amaterasu
957116916     Winner        Kuzenbo
957116916     Winner         Scylla
957116916      Loser         Khepri
957116916      Loser           Pele
957116916      Loser          Thoth
957116916      Loser           Isis
957116916      Loser             Ra
957116907     Winner           Hera
957116907     Winner        Mercury
957116907     Winner           Pele
957116907     Winner          Chaac
957116907     Winner   Ah Muzen Cab
957116907      Loser           Ymir
957116907      Loser         Merlin
957116907      Loser       Kukulkan
957116907      Loser        Bellona
957116907      Loser          Skadi
957116918     Winner       Hun Batz
957116918     Winner    King Arthur
957116918     Winner      Zhong Kui
957116918     Winner        Izanami
957116918     Winner           Ymir
957116918      Loser       Cerberus
957116918      Loser        Bellona
957116918      Loser    King Arthur
957116918      Loser          Janus
957116918      Loser          Neith


Есть CSV(30.тыс строк) файл с 10 одинаковыми индексами Match, подскажите как их можно
объединить в одну строку?
Получиться должно что-то вроде

Match     Winner1 Winner2  ... Loser1  Loser2  
957116906 Agni    Poseidon ... Awilix  Cupid
957116922 Nemesis Hera     ... Chaac   Zeus


Добавил количество входных данных т.к для индексов Match разных более 6(т.е row боле
60) появляются значения Nan в таблице и количество колонок становиться в 2 раза больше.
Loser6-10 и Winner6-10( они заполнены NaN)

Win_Status    Loser1    Loser10   Loser2       Loser3   Loser4     Loser5  Loser6
Loser7 Loser8 Loser9      Winner1 Winner10      Winner2      Winner3      Winner4 
     Winner5 Winner6   Winner7 Winner8  Winner9
Match                                                                           
                                                                                  
                                                
957116906     Awilix  Xing Tian    Cupid        Hades     Zeus  Xing Tian  Awilix
 Cupid  Hades   Zeus         Agni     Nike     Poseidon       Olorun      Guan Yu 
        Nike    Agni  Poseidon  Olorun  Guan Yu
957116907       Ymir        NaN   Merlin     Kukulkan  Bellona      Skadi     NaN
   NaN    NaN    NaN         Hera      NaN      Mercury         Pele        Chaac 
Ah Muzen Cab     NaN       NaN     NaN      NaN
957116911     Apollo        NaN  Arachne       Scylla    Horus     Merlin     NaN
   NaN    NaN    NaN     Cabrakan      NaN        Nu Wa        Skadi      Izanami 
   Ratatoskr     NaN       NaN     NaN      NaN
957116913   Poseidon        NaN     Odin        Neith  Bacchus       Ares     NaN
   NaN    NaN    NaN  Erlang Shen      NaN       Vulcan         Ymir      Chronos 
 King Arthur     NaN       NaN     NaN      NaN
957116916     Khepri        NaN     Pele        Thoth     Isis         Ra     NaN
   NaN    NaN    NaN      Izanami      NaN      Arachne    Amaterasu      Kuzenbo 
      Scylla     NaN       NaN     NaN      NaN
957116918   Cerberus        NaN  Bellona  King Arthur    Janus      Neith     NaN
   NaN    NaN    NaN     Hun Batz      NaN  King Arthur    Zhong Kui      Izanami 
        Ymir     NaN       NaN     NaN      NaN
957116921     Olorun        NaN   Fenrir       Anubis    Horus   Achilles     NaN
   NaN    NaN    NaN         Hera      NaN      Nemesis  King Arthur         Ymir 
   Cernunnos     NaN       NaN     NaN      NaN
957116922      Chaac        NaN     Zeus     Hun Batz     Rama    Bacchus     NaN
   NaN    NaN    NaN      Nemesis      NaN         Hera       Apollo    Aphrodite 
    Hercules     NaN       NaN     NaN      NaN
957116928      He Bo        NaN   Raijin     Cabrakan    Anhur   Hercules     NaN
   NaN    NaN    NaN        Freya      NaN     Camazotz        Horus  Erlang Shen 
   Xing Tian     NaN       NaN     NaN      NaN

    


Ответы

Ответ 1



Надо так: In [30]: res = (df .assign(Win_Status=df["Win_Status"] +(df.groupby(["Match", "Win_Status"]).cumcount()+1).astype(str)) .pivot_table(index="Match", columns="Win_Status", values="Reference_Name", aggfunc="first")) In [31]: res Out[31]: Win_Status Loser1 Loser2 Loser3 Loser4 Loser5 Winner1 Winner2 Winner3 Winner4 Winner5 Match 957116906 Awilix Cupid Hades Zeus Xing Tian Agni Poseidon Olorun Guan Yu Nike 957116922 Chaac Zeus Hun Batz Rama Bacchus Nemesis Hera Apollo Aphrodite Hercules

Ответ 2



Вариантов много. Сначала подготовь столбец Win_Status: def increment(x): return x + 1 def process(df): df['Win_Status'] += df.reset_index().index.map(increment).map(str) return df df = df.drop_duplicates().groupby(['Win_Status', 'Match']).apply(process) Теперь аггрегируй: с помощью groupby, затем unstack: df = df.reset_index().groupby(['Match', 'Win_Status'])['Reference_Name'].aggregate('first').unstack() с помощью pivot: df = df.pivot(index=df['Match'], columns='Win_Status')['Reference_Name'] с помощью pivot_table: df = df.pivot_table(values='Reference_Name', index=df['Match'], columns='Win_Status', aggfunc='first') Результат во всех трех случаях: Win_Status Loser1 Loser2 Loser3 Loser4 Loser5 Winner1 Winner2 Winner3 Winner4 Winner5 Match 957116906 Awilix Cupid Hades Zeus Xing Tian Agni Poseidon Olorun Guan Yu Nike 957116922 Chaac Zeus Hun Batz Rama Bacchus Nemesis Hera Apollo Aphrodite Hercules

Комментариев нет:

Отправить комментарий