Python - Find and Edit Text File


I'm looking to find out if there is a way of automating this process. I have 300,000 rows of data that need to be downloaded on a daily basis. There are a couple of rows that need to be edited before they can be uploaded to SQL.

jordan || michael | 23 | bulls | chicago
bryant | kobe ||| 8 || la

What I want to accomplish is to have 4 vertical bars per row. Normally, I search for the keyword, edit the row manually, and save. These are the 2 anomalies in the data:

  1. Find "jordan" and remove the 1 excess vertical bar "|" right after it.
  2. Find "kobe" and remove the 2 excess vertical bars "|" right after it.

The correct format is below:

jordan | michael | 23 | bulls | chicago
bryant | kobe | 8 || la

I'm not sure if this can be done in VBScript or Python. Any help would be appreciated. Thanks!

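Python or VBScript could be used, but they are overkill for something this simple. Try sed. The -E option turns on extended regular expressions, so that plain parentheses form a group and \| matches a literal vertical bar:

$ sed -E 's/(jordan *)\|/\1/g; s/(kobe *)\| *\|/\1/g' file
jordan | michael | 23 | bulls | chicago
bryant | kobe | 8 || la

To save the result to a new file:

sed -E 's/(jordan *)\|/\1/g; s/(kobe *)\| *\|/\1/g' file >newfile

Or, to change the existing file in-place (keeping a backup as file.bak):

sed -E -i.bak 's/(jordan *)\|/\1/g; s/(kobe *)\| *\|/\1/g' file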

How it works

sed reads and processes a file line by line. In our case, we need only the substitute command, which has the form s/old/new/g, where old is a regular expression and, if it is found, it is replaced with new. The optional g at the end of the command tells sed to perform the substitution 'globally', meaning not just once but as many times as the pattern appears on the line.

  • s/(jordan *)\|/\1/g

    This tells sed to look for jordan followed by zero or more spaces, followed by a vertical bar, and to remove that vertical bar.

    In more detail, the parens in (jordan *) tell sed to save the string jordan, followed by zero or more spaces, as a group. On the replacement side, we reference that group as \1.

  • s/(kobe *)\| *\|/\1/g

    Similarly, this tells sed to look for kobe followed by zero or more spaces, followed by two vertical bars that may have spaces between them, and to remove both vertical bars.
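To see what the group actually captures, the two patterns can be tried on the sample rows with Python's re module, which uses the same group and \1 backreference idea. This is only a small sketch: the variable names are made up for illustration, and the string with two "jordan" occurrences is an invented test input, not data from the question.

```python
import re

# The two sample rows from the question.
jordan_row = "jordan || michael | 23 | bulls | chicago"
kobe_row = "bryant | kobe ||| 8 || la"

# Group 1 holds the keyword plus any trailing spaces. The bar(s) after it
# are matched too, but sit outside the group, so r"\1" drops them.
jordan_pat = re.compile(r"(jordan *)\|")
kobe_pat = re.compile(r"(kobe *)\| *\|")

m = jordan_pat.search(jordan_row)
whole_match = m.group(0)  # everything the pattern matched
kept_part = m.group(1)    # only the part inside the parentheses

fixed_jordan = jordan_pat.sub(r"\1", jordan_row)
fixed_kobe = kobe_pat.sub(r"\1", kobe_row)

# sub() replaces every match by default (like sed's g flag);
# count=1 stops after the first match (like sed without g).
twice = "jordan || a jordan || b"  # invented input with two matches
all_fixed = jordan_pat.sub(r"\1", twice)
first_fixed = jordan_pat.sub(r"\1", twice, count=1)

print(repr(whole_match), repr(kept_part))
print(fixed_jordan)
print(fixed_kobe)
print(all_fixed)
print(first_fixed)
```

The difference between whole_match and kept_part is exactly the text that the substitution removes.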

Using Python

Using the same logic as above, here is a Python program:

$ cat kobe.py
import re

with open('file') as f:
    for line in f:
        # drop the one extra bar after "jordan"
        line = re.sub(r'(jordan *)\|', r'\1', line)
        # drop the two extra bars after "kobe"
        line = re.sub(r'(kobe *)\| *\|', r'\1', line)
        print(line.rstrip('\n'))
$ python kobe.py
jordan | michael | 23 | bulls | chicago
bryant | kobe | 8 || la

To save the result to a new file:

python kobe.py >newfile 
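For a daily job, it may be handier to have the script write the cleaned file itself and report any rows that still look wrong. The following is a rough sketch of that idea rather than a tested production script. The names fix_line, fix_file, and EXPECTED_BARS are hypothetical, and it assumes one row per line and that a good row has exactly 4 vertical bars, as the question states. The demo runs on a throwaway temporary copy of the sample rows plus one made-up short row.

```python
import os
import re
import tempfile

EXPECTED_BARS = 4  # assumption from the question: a good row has 4 bars

JORDAN = re.compile(r"(jordan *)\|")
KOBE = re.compile(r"(kobe *)\| *\|")


def fix_line(line):
    """Apply the two substitutions from the answer to a single row."""
    line = JORDAN.sub(r"\1", line)
    return KOBE.sub(r"\1", line)


def fix_file(src, dst):
    """Write a cleaned copy of src to dst, reading one line at a time.

    Returns the 1-based line numbers of non-empty rows that still do not
    have EXPECTED_BARS vertical bars, so they can be checked by hand.
    """
    suspicious = []
    with open(src) as fin, open(dst, "w") as fout:
        for lineno, line in enumerate(fin, start=1):
            fixed = fix_line(line)
            if fixed.strip() and fixed.count("|") != EXPECTED_BARS:
                suspicious.append(lineno)
            fout.write(fixed)
    return suspicious


# Demo on temporary files so nothing real is overwritten.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "file")
    dst = os.path.join(tmp, "newfile")
    with open(src, "w") as f:
        f.write("jordan || michael | 23 | bulls | chicago\n")
        f.write("bryant | kobe ||| 8 || la\n")
        f.write("a | b | c\n")  # made-up row with too few bars
    bad_rows = fix_file(src, dst)
    with open(dst) as f:
        cleaned = f.read().splitlines()

print(cleaned)
print("rows needing a manual look:", bad_rows)
```

In real use, fix_file would be called with the path of the downloaded file and the path for the cleaned output. Because it reads line by line, the 300,000 rows never have to be held in memory at once.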
