How to extract specific columns from a space separated file in Python?
I'm trying to process a file from the Protein Data Bank that is separated by spaces (not \t). I have a .txt file and I want to extract specific rows and, from those rows, only a few columns.
I need to do this in Python. I tried the command line first and used the awk command with no problem, but I have no idea how to do the same in Python.
Here is an extract of my file:
    [...]
    seqres 6 b 80 ala leu ser ile lys lys ala gln thr pro gln gln trp
    seqres 7 b 80 lys pro
    helix 1 1 thr a 68 ser a 81 1 14
    helix 2 2 cys a 97 leu a 110 1 14
    helix 3 3 asn a 122 ser a 133 1 12
    [...]
For example, I'd like to take the 'helix' rows and, from them, the 4th, 6th, 7th and 9th columns. I started by reading the file line by line with a for loop and extracted the rows starting with 'helix'... and that's all.
Edit: this is the code I have right now. The print doesn't work properly: it only prints the first line of each block (helix, sheet and dbref).
    #!/usr/bin/python
    import sys

    for line in open(sys.argv[1]):
        if 'helix' in line:
            helix = line.split()
        elif 'sheet' in line:
            sheet = line.split()
        elif 'dbref' in line:
            dbref = line.split()

    print (helix), (sheet), (dbref)
If you have already extracted the line, you can split it using line.split(). This gives you a list, from which you can extract the elements you need:
    >>> test = 'helix 2 2 cys a 97'
    >>> test.split()
    ['helix', '2', '2', 'cys', 'a', '97']
    >>> test.split()[3]
    'cys'
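Putting the same idea into a loop, a minimal sketch of the whole task could look like the code below. The function name `extract_columns`, its parameters and the sample lines are made up for illustration. The sample assumes each helix row carries a one-letter chain column (the 'a' in the split output above), and the columns are counted from 1 as in the question.

```python
def extract_columns(lines, record="helix", columns=(4, 6, 7, 9)):
    """Return the requested 1-based columns from every row of one record type."""
    rows = []
    for line in lines:
        fields = line.split()  # splits on any run of whitespace
        # Compare the first field only, so the word appearing later in a
        # line does not count as a match.
        if not fields or fields[0] != record:
            continue
        # Skip rows too short to hold every requested column.
        if len(fields) < max(columns):
            continue
        rows.append([fields[c - 1] for c in columns])
    return rows


# Hypothetical sample data, shaped like the extract in the question.
sample = [
    "seqres 7 b 80 lys pro",
    "helix 1 1 thr a 68 ser a 81 1 14",
    "helix 2 2 cys a 97 leu a 110 1 14",
]

for row in extract_columns(sample):
    print(" ".join(row))
```

Because every matching row is appended to a list, all helix rows are kept rather than a single one. To run it on a real file, pass the open file object instead of the list, for example `with open(sys.argv[1]) as handle: rows = extract_columns(handle)`. The comparison is case-sensitive, so `record` has to be written the way it appears in the file.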