listops
This module gathers list/line operations
after
- class textops.after(pattern, get_begin=False, key=None)
Extract lines after a pattern
Works like textops.before except that it yields all lines from the input AFTER the given pattern has been found.
Parameters:
- pattern (str or regex or list) – start yielding lines after reaching this pattern(s)
- get_begin (bool) – if True : include the line matching the pattern (Default : False)
- key (int or str) – test only one column or one key (optional)
Yields: str or list or dict – lines after the specified pattern
Examples
>>> ['a','b','c','d','e','f'] | after('c').tolist()
['d', 'e', 'f']
>>> ['a','b','c','d','e','f'] | after('c',True).tolist()
['c', 'd', 'e', 'f']
>>> input_text = [{'k':1},{'k':2},{'k':3},{'k':4},{'k':5},{'k':6}]
>>> input_text | after('3',key='k').tolist()
[{'k': 4}, {'k': 5}, {'k': 6}]
>>> input_text >> after('3',key='k')
[{'k': 4}, {'k': 5}, {'k': 6}]
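The behaviour shown in these examples can be approximated in plain Python. The sketch below is not the textops implementation: `after_sketch` is a hypothetical helper that assumes a single regular-expression pattern and ignores the `key` argument.

```python
import re

def after_sketch(lines, pattern, get_begin=False):
    """Yield the lines that follow the first line matching `pattern`."""
    found = False
    for line in lines:
        if found:
            yield line
        elif re.search(pattern, str(line)):
            # First match: start yielding from the next line on,
            # or from this line if get_begin is set.
            found = True
            if get_begin:
                yield line

letters = ['a', 'b', 'c', 'd', 'e', 'f']
print(list(after_sketch(letters, 'c')))        # ['d', 'e', 'f']
print(list(after_sketch(letters, 'c', True)))  # ['c', 'd', 'e', 'f']
```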
afteri
- class textops.afteri(pattern, get_begin=False, key=None)
Extract lines after a pattern (case insensitive)
Works like textops.after except that the pattern is case insensitive.
Parameters:
- pattern (str or regex or list) – start yielding lines after reaching this pattern(s)
- get_begin (bool) – if True : include the line matching the pattern (Default : False)
- key (int or str) – test only one column or one key (optional)
Yields: str or list or dict – lines after the specified pattern
Examples
>>> ['a','b','c','d','e','f'] | after('C').tolist()
[]
>>> ['a','b','c','d','e','f'] | afteri('C').tolist()
['d', 'e', 'f']
>>> ['a','b','c','d','e','f'] | afteri('C',True).tolist()
['c', 'd', 'e', 'f']
>>> ['a','b','c','d','e','f'] >> afteri('C',True)
['c', 'd', 'e', 'f']
before
- class textops.before(pattern, get_end=False, key=None)
Extract lines before a pattern
Works like textops.between except that it requires only the ending pattern : it yields all lines from the beginning of the input text until the specified pattern has been reached.
Parameters:
- pattern (str or regex or list) – no more lines are yielded after reaching this pattern(s)
- get_end (bool) – if True : include the line matching the end pattern (Default : False)
- key (int or str) – test only one column or one key (optional)
Yields: str or list or dict – lines before the specified pattern
Examples
>>> ['a','b','c','d','e','f'] | before('c').tolist()
['a', 'b']
>>> ['a','b','c','d','e','f'] | before('c',True).tolist()
['a', 'b', 'c']
>>> input_text = [{'k':1},{'k':2},{'k':3},{'k':4},{'k':5},{'k':6}]
>>> input_text | before('3',key='k').tolist()
[{'k': 1}, {'k': 2}]
>>> input_text >> before('3',key='k')
[{'k': 1}, {'k': 2}]
beforei
- class textops.beforei(pattern, get_end=False, key=None)
Extract lines before a pattern (case insensitive)
Works like textops.before except that the pattern is case insensitive.
Parameters:
- pattern (str or regex or list) – no more lines are yielded after reaching this pattern(s)
- get_end (bool) – if True : include the line matching the pattern (Default : False)
- key (int or str) – test only one column or one key (optional)
Yields: str or list or dict – lines before the specified pattern
Examples
>>> ['a','b','c','d','e','f'] | before('C').tolist()
['a', 'b', 'c', 'd', 'e', 'f']
>>> ['a','b','c','d','e','f'] | beforei('C').tolist()
['a', 'b']
>>> ['a','b','c','d','e','f'] | beforei('C',True).tolist()
['a', 'b', 'c']
>>> ['a','b','c','d','e','f'] >> beforei('C',True)
['a', 'b', 'c']
between
- class textops.between(begin, end, get_begin=False, get_end=False, key=None)
Extract lines between two patterns
It searches for the starting pattern, then yields lines until it reaches the ending pattern. A pattern can be a string or a regex object. It can also be a list of strings or regexes; in that case, all patterns in the list must be matched in the same order, which can help to select a part of the text more precisely.
between works for any kind of list of strings, but also for list of lists and list of dicts. In these cases, one can test only one column or one key but return the whole list/dict.
Parameters:
- begin (str or regex or list) – the pattern(s) to reach before yielding lines from the input
- end (str or regex or list) – no more lines are yielded after reaching this pattern(s)
- get_begin (bool) – if True : include the line matching the begin pattern (Default : False)
- get_end (bool) – if True : include the line matching the end pattern (Default : False)
- key (int or str) – test only one column or one key (optional)
Yields: str or list or dict – lines between two patterns
Examples
>>> 'a\nb\nc\nd\ne\nf' | between('b','e').tostr()
'c\nd'
>>> 'a\nb\nc\nd\ne\nf' | between('b','e',True,True).tostr()
'b\nc\nd\ne'
>>> ['a','b','c','d','e','f'] | between('b','e').tolist()
['c', 'd']
>>> ['a','b','c','d','e','f'] >> between('b','e')
['c', 'd']
>>> ['a','b','c','d','e','f'] | between('b','e',True,True).tolist()
['b', 'c', 'd', 'e']
>>> input_text = [('a',1),('b',2),('c',3),('d',4),('e',5),('f',6)]
>>> input_text | between('b','e').tolist()
[('c', 3), ('d', 4)]
>>> input_text = [{'a':1},{'b':2},{'c':3},{'d':4},{'e':5},{'f':6}]
>>> input_text | between('b','e').tolist()
[{'c': 3}, {'d': 4}]
>>> input_text = [{'k':1},{'k':2},{'k':3},{'k':4},{'k':5},{'k':6}]
>>> input_text | between('2','5',key='k').tolist()
[{'k': 3}, {'k': 4}]
>>> input_text = [{'k':1},{'k':2},{'k':3},{'k':4},{'k':5},{'k':6}]
>>> input_text | between('2','5',key='v').tolist()
[]
>>> input_text = [('a',1),('b',2),('c',3),('d',4),('e',5),('f',6)]
>>> input_text | between('b','e',key=0).tolist()
[('c', 3), ('d', 4)]
>>> input_text = [('a',1),('b',2),('c',3),('d',4),('e',5),('f',6)]
>>> input_text | between('b','e',key=1).tolist()
[]
>>> s='''Chapter 1
... ------------
... some infos
...
... Chapter 2
... ---------
... infos I want
...
... Chaper 3
... --------
... some other infos'''
>>> print s | between('---',r'^\s*$').tostr()
some infos
>>> print s | between(['Chapter 2','---'],r'^\s*$').tostr()
infos I want
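The ordered-list form of the patterns is easier to follow as a small state machine. The following is a sketch only, not the library's code: `between_sketch` is a hypothetical helper, it handles lines of text without the `key` argument, and it assumes that extraction stops at the first complete end match, which is what the chapter example above suggests.

```python
import re

def between_sketch(lines, begin, end, get_begin=False, get_end=False):
    """Yield lines found after the begin pattern(s) and before the end pattern(s)."""
    # Accept a single pattern or a list of patterns that must match in order.
    begin = list(begin) if isinstance(begin, (list, tuple)) else [begin]
    end = list(end) if isinstance(end, (list, tuple)) else [end]
    started = False
    for line in lines:
        if not started:
            if re.search(begin[0], line):
                begin.pop(0)
                if not begin:
                    # Every begin pattern has been seen: start yielding.
                    started = True
                    if get_begin:
                        yield line
        else:
            if re.search(end[0], line):
                end.pop(0)
                if not end:
                    # Every end pattern has been seen: stop for good.
                    if get_end:
                        yield line
                    return
            yield line

text = """Chapter 1
------------
some infos

Chapter 2
---------
infos I want

Chapter 3
--------
some other infos""".splitlines()

print(list(between_sketch(text, '---', r'^\s*$')))                 # ['some infos']
print(list(between_sketch(text, ['Chapter 2', '---'], r'^\s*$')))  # ['infos I want']
```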
betweeni
- class textops.betweeni(begin, end, get_begin=False, get_end=False, key=None)
Extract lines between two patterns (case insensitive)
Works like textops.between except that patterns are case insensitive.
Parameters:
- begin (str or regex or list) – the pattern(s) to reach before yielding lines from the input
- end (str or regex or list) – no more lines are yielded after reaching this pattern(s)
- get_begin (bool) – if True : include the line matching the begin pattern (Default : False)
- get_end (bool) – if True : include the line matching the end pattern (Default : False)
- key (int or str) – test only one column or one key (optional)
Yields: str or list or dict – lines between two patterns
Examples
>>> ['a','b','c','d','e','f'] | between('B','E').tolist()
[]
>>> ['a','b','c','d','e','f'] | betweeni('B','E').tolist()
['c', 'd']
>>> ['a','b','c','d','e','f'] >> betweeni('B','E')
['c', 'd']
betweenb
- class textops.betweenb(begin, end, get_begin=False, get_end=False, key=None)
Extract lines between two patterns (includes boundaries)
Works like textops.between except that it returns boundaries by default, that is get_begin = get_end = True.
Parameters:
- begin (str or regex or list) – the pattern(s) to reach before yielding lines from the input
- end (str or regex or list) – no more lines are yielded after reaching this pattern(s)
- get_begin (bool) – if True : include the line matching the begin pattern (Default : False)
- get_end (bool) – if True : include the line matching the end pattern (Default : False)
- key (int or str) – test only one column or one key (optional)
Yields: str or list or dict – lines between two patterns
Examples
>>> ['a','b','c','d','e','f'] | betweenb('b','e').tolist()
['b', 'c', 'd', 'e']
>>> ['a','b','c','d','e','f'] >> betweenb('b','e')
['b', 'c', 'd', 'e']
betweenbi
- class textops.betweenbi(begin, end, get_begin=False, get_end=False, key=None)
Extract lines between two patterns (includes boundaries and case insensitive)
Works like textops.between except that patterns are case insensitive and it yields boundaries too, that is get_begin = get_end = True.
Parameters:
- begin (str or regex or list) – the pattern(s) to reach before yielding lines from the input
- end (str or regex or list) – no more lines are yielded after reaching this pattern(s)
- get_begin (bool) – if True : include the line matching the begin pattern (Default : False)
- get_end (bool) – if True : include the line matching the end pattern (Default : False)
- key (int or str) – test only one column or one key (optional)
Yields: str or list or dict – lines between two patterns
Examples
>>> ['a','b','c','d','e','f'] | betweenb('B','E').tolist()
[]
>>> ['a','b','c','d','e','f'] | betweenbi('B','E').tolist()
['b', 'c', 'd', 'e']
>>> ['a','b','c','d','e','f'] >> betweenbi('B','E')
['b', 'c', 'd', 'e']
cat
- class textops.cat(context={})
Return the content of the file with the path given in the input text
If a context dict is specified, the path is formatted with that context (str.format)
Parameters: context (dict) – The context to format the file path (optional)
Yields: str – the file content lines
Examples
>>> open('/tmp/testfile.txt','w').write('here is the file content')
>>> '/tmp/testfile.txt' | cat()
<generator object extend_type_gen at ...>
>>> '/tmp/testfile.txt' | cat().tostr()
'here is the file content'
>>> '/tmp/testfile.txt' >> cat()
['here is the file content']
>>> '/tmp/testfile.txt' | cat().upper().tostr()
'HERE IS THE FILE CONTENT'
>>> context = {'path':'/tmp/'}
>>> '{path}testfile.txt' | cat(context)
<generator object extend_type_gen at ...>
>>> '{path}testfile.txt' | cat(context).tostr()
'here is the file content'
>>> cat('/tmp/testfile.txt').s
'here is the file content'
>>> cat('/tmp/testfile.txt').upper().s
'HERE IS THE FILE CONTENT'
>>> cat('/tmp/testfile.txt').l
['here is the file content']
>>> cat('/tmp/testfile.txt').g
<generator object extend_type_gen at ...>
>>> for line in cat('/tmp/testfile.txt'):
...     print line
...
here is the file content
>>> for bits in cat('/tmp/testfile.txt').grep('content').cut():
...     print bits
...
['here', 'is', 'the', 'file', 'content']
>>> open('/tmp/testfile.txt','w').write('here is the file content\nanother line')
>>> '/tmp/testfile.txt' | cat().tostr()
'here is the file content\nanother line'
>>> '/tmp/testfile.txt' | cat().tolist()
['here is the file content', 'another line']
>>> cat('/tmp/testfile.txt').s
'here is the file content\nanother line'
>>> cat('/tmp/testfile.txt').l
['here is the file content', 'another line']
>>> context = {'path': '/tmp/'}
>>> cat('/{path}/testfile.txt',context).l
['here is the file content', 'another line']
>>> for bits in cat('/tmp/testfile.txt').grep('content').cut():
...     print bits
...
['here', 'is', 'the', 'file', 'content']
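The context formatting can be illustrated without textops. This is a rough sketch under stated assumptions: `cat_sketch` is a hypothetical helper, it formats the path with `str.format` as the description says, and it writes to a temporary directory instead of a fixed /tmp path.

```python
import os
import tempfile

def cat_sketch(path_template, context=None):
    """Yield the lines of the file whose path is built with str.format."""
    path = path_template.format(**(context or {}))
    with open(path) as handle:
        for line in handle:
            yield line.rstrip('\n')

# Demo on a throwaway file.
tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, 'testfile.txt'), 'w') as handle:
    handle.write('here is the file content\nanother line')

lines = list(cat_sketch('{path}/testfile.txt', {'path': tmp_dir}))
print(lines)  # ['here is the file content', 'another line']
```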
doreduce
- class textops.doreduce(reduce_fn, initializer=None)
Reduce the input text
Uses the Python reduce() function.
Parameters: Returns: reduced value
Return type: any
Examples
>>> import re
>>> 'a1\nb2\nc3\nd4' | doreduce(lambda x,y:x+re.sub(r'\d','',y),'')
'abcd'
>>> 'a1\nb2\nc3\nd4' >> doreduce(lambda x,y:x+re.sub(r'\d','',y),'')
'abcd'
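For comparison, here is the same reduction written with the standard library alone. It is a sketch of the equivalent `functools.reduce` call on the split lines, not the textops code path.

```python
import re
from functools import reduce

lines = 'a1\nb2\nc3\nd4'.splitlines()

# Strip the digits from each line and concatenate, starting from ''.
result = reduce(lambda acc, line: acc + re.sub(r'\d', '', line), lines, '')
print(result)  # abcd
```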
doslice
- class textops.doslice(begin=0, end=sys.maxsize, step=1)
Get lines/items from begin line to end line with some step
Parameters: Returns: A slice of the original text
Return type: generator
Examples
>>> s='a\nb\nc\nd\ne\nf'
>>> s | doslice(1,4).tolist()
['b', 'c', 'd']
>>> s >> doslice(1,4)
['b', 'c', 'd']
>>> s >> doslice(2)
['c', 'd', 'e', 'f']
>>> s >> doslice(0,4,2)
['a', 'c']
>>> s >> doslice(None,None,2)
['a', 'c', 'e']
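The same slices can be reproduced lazily with `itertools.islice`. This only mirrors the begin/end/step semantics of the examples; whether textops uses `islice` internally is not stated here.

```python
from itertools import islice

lines = 'a\nb\nc\nd\ne\nf'.splitlines()

middle = list(islice(lines, 1, 4))               # ['b', 'c', 'd']
tail = list(islice(lines, 2, None))              # ['c', 'd', 'e', 'f']
stepped = list(islice(lines, 0, 4, 2))           # ['a', 'c']
every_other = list(islice(lines, None, None, 2)) # ['a', 'c', 'e']
print(middle, tail, stepped, every_other)
```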
first
- class textops.first()
Return the first line/item from the input text
Returns: the first line/item from the input text
Return type: StrExt, ListExt or DictExt
Examples
>>> 'a\nb\nc' | first()
'a'
>>> ['a','b','c'] | first()
'a'
>>> [('a',1),('b',2),('c',3)] | first()
['a', 1]
>>> [['key1','val1','help1'],['key2','val2','help2']] | first()
['key1', 'val1', 'help1']
>>> [{'key':'a','val':1},{'key':'b','val':2},{'key':'c','val':3}] | first()
{'key': 'a', 'val': 1}
formatdicts
- class textops.formatdicts(format_str='{key} : {val}\n', join_str = '', defvalue='-')
Formats list of dicts
Useful to convert list of dicts into a simple string. It converts the list of dicts into a list of strings by using the format_str, then it joins all the strings with join_str to get a unique simple string.
Parameters: Returns: formatted input
Return type: str
Examples
>>> input = [{'key':'a','val':1},{'key':'b','val':2},{'key':'c'}]
>>> input | formatdicts()
'a : 1\nb : 2\nc : -\n'
>>> input | formatdicts('{key} -> {val}\n',defvalue='N/A')
'a -> 1\nb -> 2\nc -> N/A\n'
>>> input = [{'name':'Eric','age':47,'level':'guru'},
... {'name':'Guido','age':59,'level':'god'}]
>>> print input | formatdicts('{name}({age}) : {level}\n')
Eric(47) : guru
Guido(59) : god
>>> print input | formatdicts('{name}', ', ')
Eric, Guido
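One way to picture the `defvalue` fallback is a dict subclass that supplies a placeholder for missing keys. This is an illustrative sketch, not the library's implementation: `formatdicts_sketch` and `_Default` are hypothetical names.

```python
class _Default(dict):
    """dict that returns a placeholder instead of raising KeyError."""
    def __init__(self, data, defvalue):
        super().__init__(data)
        self.defvalue = defvalue

    def __missing__(self, key):
        return self.defvalue

def formatdicts_sketch(dicts, format_str='{key} : {val}\n', join_str='', defvalue='-'):
    # Format each dict, then join the resulting strings.
    return join_str.join(format_str.format_map(_Default(d, defvalue)) for d in dicts)

rows = [{'key': 'a', 'val': 1}, {'key': 'b', 'val': 2}, {'key': 'c'}]
print(repr(formatdicts_sketch(rows)))  # 'a : 1\nb : 2\nc : -\n'

people = [{'name': 'Eric', 'age': 47, 'level': 'guru'},
          {'name': 'Guido', 'age': 59, 'level': 'god'}]
print(formatdicts_sketch(people, '{name}', ', '))  # Eric, Guido
```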
formatitems
- class textops.formatitems(format_str='{0} : {1}\n', join_str = '')
Formats list of 2-sized tuples
Useful to convert list of 2-sized tuples into a simple string. It converts the list of tuples into a list of strings by using the format_str, then it joins all the strings with join_str to get a unique simple string.
Parameters: Returns: formatted input
Return type: str
Examples
>>> [('key1','val1'),('key2','val2')] | formatitems('{0} -> {1}\n')
'key1 -> val1\nkey2 -> val2\n'
>>> [('key1','val1'),('key2','val2')] | formatitems('{0}:{1}',', ')
'key1:val1, key2:val2'
formatlists
- class textops.formatlists(format_str='{0} : {1}\n', join_str = '')
Formats list of lists
Useful to convert list of lists into a simple string. It converts the list of lists into a list of strings by using the format_str, then it joins all the strings with join_str to get a unique simple string.
Parameters: Returns: formatted input
Return type: str
Examples
>>> [['key1','val1','help1'],['key2','val2','help2']] | formatlists('{2} : {0} -> {1}\n')
'help1 : key1 -> val1\nhelp2 : key2 -> val2\n'
>>> [['key1','val1','help1'],['key2','val2','help2']] | formatlists('{0}:{1} ({2})',', ')
'key1:val1 (help1), key2:val2 (help2)'
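The format-then-join step described for formatitems and formatlists amounts to positional `str.format` on each row. The helper below is a hypothetical sketch of that idea and works for tuples and lists alike.

```python
def formatlists_sketch(rows, format_str='{0} : {1}\n', join_str=''):
    """Format each row positionally, then join the resulting strings."""
    return join_str.join(format_str.format(*row) for row in rows)

rows = [['key1', 'val1', 'help1'], ['key2', 'val2', 'help2']]
print(formatlists_sketch(rows, '{0}:{1} ({2})', ', '))
# key1:val1 (help1), key2:val2 (help2)

pairs = [('key1', 'val1'), ('key2', 'val2')]
print(repr(formatlists_sketch(pairs, '{0} -> {1}\n')))
# 'key1 -> val1\nkey2 -> val2\n'
```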
greaterequal
- class textops.greaterequal(value, key=None)
Extract lines with value greater than or equal to specified string
It works like textops.greaterthan except the test is “greater than or equal to”
Parameters:
- value (str) – string to test with
- key (int or str or callable) –
Specify what should really be compared:
- None : the whole current line,
- an int : test only the specified column (for list of lists),
- a string : test only the dict value for the specified key (for list of dicts),
- a callable : it will receive the line being tested and return the string to really compare.
Note : key argument MUST BE PASSED BY NAME
Yields: str or list or dict – lines having values greater than or equal to the specified value
Examples
>>> logs = '''2015-08-11 aaaa
... 2015-08-23 bbbb
... 2015-09-14 ccc
... 2015-11-05 ddd'''
>>> logs | greaterequal('2015-09-14 ccc').tolist()
['2015-09-14 ccc', '2015-11-05 ddd']
>>> logs >> greaterequal('2015-09-14 ccc')
['2015-09-14 ccc', '2015-11-05 ddd']
greaterthan
- class textops.greaterthan(value, key=None)
Extract lines with value strictly greater than specified string
It works like textops.lessthan except the test is “greater than”
Parameters:
- value (str) – string to test with
- key (int or str or callable) –
Specify what should really be compared:
- None : the whole current line,
- an int : test only the specified column (for list of lists),
- a string : test only the dict value for the specified key (for list of dicts),
- a callable : it will receive the line being tested and return the string to really compare.
Note : key argument MUST BE PASSED BY NAME
Yields: str or list or dict – lines having values greater than the specified value
Examples
>>> logs = '''2015-08-11 aaaa
... 2015-08-23 bbbb
... 2015-09-14 ccc
... 2015-11-05 ddd'''
>>> logs | greaterthan('2015-09-14 ccc').tolist()
['2015-11-05 ddd']
>>> logs >> greaterthan('2015-09-14 ccc')
['2015-11-05 ddd']
grep
- class textops.grep(pattern, key=None)
Select lines having a specified pattern
This works like the shell command ‘egrep’ : it will filter the input text and retain only lines matching the pattern.
It works for any kind of list of strings, but also for list of lists and list of dicts. In these cases, one can test only one column or one key but return the whole list/dict. Before testing, the object to be tested is converted into a string with str(), so grep works for any kind of object.
Parameters:
- pattern (str) – a regular expression string (case sensitive)
- key (int or str) – test only one column or one key (optional)
Yields: str, list or dict – the filtered input text
Examples
>>> input = 'error1\nerror2\nwarning1\ninfo1\nwarning2\ninfo2'
>>> input | grep('error')
<generator object extend_type_gen at ...>
>>> input | grep('error').tolist()
['error1', 'error2']
>>> input >> grep('error')
['error1', 'error2']
>>> input | grep('ERROR').tolist()
[]
>>> input | grep('error|warning').tolist()
['error1', 'error2', 'warning1', 'warning2']
>>> input | cutca(r'(\D+)(\d+)')
[('error', '1'), ('error', '2'), ('warning', '1'), ('info', '1'), ('warning', '2'), ('info', '2')]
>>> input | cutca(r'(\D+)(\d+)').grep('1',1).tolist()
[('error', '1'), ('warning', '1'), ('info', '1')]
>>> input | cutdct(r'(?P<level>\D+)(?P<nb>\d+)')
[{'nb': '1', 'level': 'error'}, {'nb': '2', 'level': 'error'}, {'nb': '1', 'level': 'warning'}, {'nb': '1', 'level': 'info'}, {'nb': '2', 'level': 'warning'}, {'nb': '2', 'level': 'info'}]
>>> input | cutdct(r'(?P<level>\D+)(?P<nb>\d+)').grep('1','nb').tolist()
[{'nb': '1', 'level': 'error'}, {'nb': '1', 'level': 'warning'}, {'nb': '1', 'level': 'info'}]
>>> [{'more simple':1},{'way to grep':2},{'list of dicts':3}] | grep('way').tolist()
[{'way to grep': 2}]
>>> [{'more simple':1},{'way to grep':2},{'list of dicts':3}] | grep('3').tolist()
[{'list of dicts': 3}]
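The column/key selection and the `str()` conversion can be sketched in a few lines. This is an approximation for illustration, not the textops source: `grep_sketch` is a hypothetical helper, and treating a missing dict key as an empty string is an assumption.

```python
import re

def grep_sketch(items, pattern, key=None, flags=0):
    """Yield whole items whose tested part matches the regular expression."""
    regex = re.compile(pattern, flags)
    for item in items:
        if key is None:
            target = item               # test the whole line/item
        elif isinstance(item, dict):
            target = item.get(key, '')  # test one dict value (assumed '' if missing)
        else:
            target = item[key]          # test one column
        # str() lets the same test work on any kind of object.
        if regex.search(str(target)):
            yield item

lines = 'error1\nerror2\nwarning1\ninfo1\nwarning2\ninfo2'.splitlines()
print(list(grep_sketch(lines, 'error|warning')))
# ['error1', 'error2', 'warning1', 'warning2']

rows = [('error', '1'), ('error', '2'), ('warning', '1')]
print(list(grep_sketch(rows, '1', key=1)))
# [('error', '1'), ('warning', '1')]

dicts = [{'more simple': 1}, {'way to grep': 2}, {'list of dicts': 3}]
print(list(grep_sketch(dicts, 'way')))
# [{'way to grep': 2}]
```

Passing `flags=re.IGNORECASE` gives a case-insensitive variant in the spirit of grepi.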
grepi
- class textops.grepi(pattern, key=None)
grep case insensitive
This works like textops.grep, except it is case insensitive.
Parameters:
- pattern (str) – a regular expression string (case insensitive)
- key (int or str) – test only one column or one key (optional)
Yields: str, list or dict – the filtered input text
Examples
>>> input = 'error1\nerror2\nwarning1\ninfo1\nwarning2\ninfo2'
>>> input | grepi('ERROR').tolist()
['error1', 'error2']
>>> input >> grepi('ERROR')
['error1', 'error2']
grepv
- class textops.grepv(pattern, key=None)
grep with inverted matching
This works like textops.grep, except it returns lines that do NOT match the specified pattern.
Parameters:
- pattern (str) – a regular expression string
- key (int or str) – test only one column or one key (optional)
Yields: str, list or dict – the filtered input text
Examples
>>> input = 'error1\nerror2\nwarning1\ninfo1\nwarning2\ninfo2'
>>> input | grepv('error').tolist()
['warning1', 'info1', 'warning2', 'info2']
>>> input >> grepv('error')
['warning1', 'info1', 'warning2', 'info2']
>>> input | grepv('ERROR').tolist()
['error1', 'error2', 'warning1', 'info1', 'warning2', 'info2']
grepvi
- class textops.grepvi(pattern, key=None)
grep case insensitive with inverted matching
This works like textops.grepv, except it is case insensitive.
Parameters:
- pattern (str) – a regular expression string (case insensitive)
- key (int or str) – test only one column or one key (optional)
Yields: str, list or dict – the filtered input text
Examples
>>> input = 'error1\nerror2\nwarning1\ninfo1\nwarning2\ninfo2'
>>> input | grepvi('ERROR').tolist()
['warning1', 'info1', 'warning2', 'info2']
>>> input >> grepvi('ERROR')
['warning1', 'info1', 'warning2', 'info2']
grepc
- class textops.grepc(pattern, key=None)
Count lines having a specified pattern
This works like textops.grep except that instead of filtering the input text, it counts lines matching the pattern.
Parameters:
- pattern (str) – a regular expression string (case sensitive)
- key (int or str) – test only one column or one key (optional)
Returns: the matched lines count
Return type: int
Examples
>>> input = 'error1\nerror2\nwarning1\ninfo1\nwarning2\ninfo2'
>>> input | grepc('error')
2
>>> input | grepc('ERROR')
0
>>> input | grepc('error|warning')
4
>>> [{'more simple':1},{'way to grep':2},{'list of dicts':3}] | grepc('3')
1
>>> [{'more simple':1},{'way to grep':2},{'list of dicts':3}] | grepc('2','way to grep')
1
grepci
- class textops.grepci(pattern, key=None)
Count lines having a specified pattern (case insensitive)
This works like textops.grepc except that the pattern is case insensitive.
Parameters:
- pattern (str) – a regular expression string (case insensitive)
- key (int or str) – test only one column or one key (optional)
Returns: the matched lines count
Return type: int
Examples
>>> input = 'error1\nerror2\nwarning1\ninfo1\nwarning2\ninfo2'
>>> input | grepci('ERROR')
2
grepcv
- class textops.grepcv(pattern, key=None)
Count lines NOT having a specified pattern
This works like textops.grepc except that it counts lines that do NOT match the pattern.
Parameters:
- pattern (str) – a regular expression string (case sensitive)
- key (int or str) – test only one column or one key (optional)
Returns: the NOT matched lines count
Return type: int
Examples
>>> input = 'error1\nerror2\nwarning1\ninfo1\nwarning2\ninfo2'
>>> input | grepcv('error')
4
>>> input | grepcv('ERROR')
6
grepcvi
- class textops.grepcvi(pattern, key=None)
Count lines NOT having a specified pattern (case insensitive)
This works like textops.grepcv except that the pattern is case insensitive.
Parameters:
- pattern (str) – a regular expression string (case insensitive)
- key (int or str) – test only one column or one key (optional)
Returns: the NOT matched lines count
Return type: int
Examples
>>> input = 'error1\nerror2\nwarning1\ninfo1\nwarning2\ninfo2'
>>> input | grepcvi('ERROR')
4
haspattern
- class textops.haspattern(pattern, key=None)
Tests if the input text matches the specified pattern
This reads the input text line by line (or item by item for lists and generators) and casts each one into a string before testing. Like textops.grepc, it accepts testing on a specific column for a list of lists or on a specific key for a list of dicts. It stops reading the input text as soon as the pattern is found, which is useful for big input texts.
Parameters:
- pattern (str) – a regular expression string (case sensitive)
- key (int or str) – test only one column or one key (optional)
Returns: True if the pattern is found.
Return type: bool
Examples
>>> input = 'error1\nerror2\nwarning1\ninfo1\nwarning2\ninfo2'
>>> input | haspattern('error')
True
>>> input | haspattern('ERROR')
False
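The early-exit behaviour is the interesting part for large inputs. The sketch below uses `any()` over a generator to show the idea; `haspattern_sketch` and `tracked` are hypothetical names, and `tracked` exists only to record how many lines were actually read.

```python
import re

def haspattern_sketch(lines, pattern):
    """Return True as soon as one line matches; later lines are never read."""
    return any(re.search(pattern, str(line)) for line in lines)

consumed = []

def tracked(lines):
    # Record each line at the moment it is pulled from the source.
    for line in lines:
        consumed.append(line)
        yield line

source = 'error1\nerror2\nwarning1\ninfo1\nwarning2\ninfo2'.splitlines()
found = haspattern_sketch(tracked(source), 'error2')
print(found, consumed)  # True ['error1', 'error2']
```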
haspatterni
- class textops.haspatterni(pattern, key=None)
Tests if the input text matches the specified pattern (case insensitive)
Works like textops.haspattern except that it is case insensitive.
Parameters:
- pattern (str) – a regular expression string (case insensitive)
- key (int or str) – test only one column or one key (optional)
Returns: True if the pattern is found.
Return type: bool
Examples
>>> input = 'error1\nerror2\nwarning1\ninfo1\nwarning2\ninfo2'
>>> input | haspatterni('ERROR')
True
head
- class textops.head(lines)
Return first lines from the input text
Parameters: lines (int) – The number of lines/items to return.
Yields: str, lists or dicts – the first ‘lines’ lines from the input text
Examples
>>> 'a\nb\nc' | head(2).tostr()
'a\nb'
>>> for l in 'a\nb\nc' | head(2):
...     print l
a
b
>>> ['a','b','c'] | head(2).tolist()
['a', 'b']
>>> ['a','b','c'] >> head(2)
['a', 'b']
>>> [('a',1),('b',2),('c',3)] | head(2).tolist()
[('a', 1), ('b', 2)]
>>> [{'key':'a','val':1},{'key':'b','val':2},{'key':'c','val':3}] | head(2).tolist()
[{'val': 1, 'key': 'a'}, {'val': 2, 'key': 'b'}]
iffn
- class textops.iffn(filter_fn=None)
Filters the input text with a specified function
It works like the Python filter() function.
Parameters: Yields: any – lines filtered by the filter_fn function
Examples
>>> import re
>>> 'line1\nline2\nline3\nline4' | iffn(lambda l:int(re.sub(r'\D','',l)) % 2).tolist()
['line1', 'line3']
>>> 'line1\nline2\nline3\nline4' >> iffn(lambda l:int(re.sub(r'\D','',l)) % 2)
['line1', 'line3']
inrange
- class textops.inrange(begin, end, get_begin=True, get_end=False, key=None)
Extract lines between a range of strings
For each input line, it tests whether it is greater than or equal to the begin argument and strictly less than the end argument. Unlike textops.between, there is no need for any line to match the begin or end string.
inrange works for any kind of list of strings, but also for list of lists and list of dicts. In these cases, one can test only one column or one key but return the whole list/dict.
Each string to be tested is converted to the same type as the first argument.
Parameters:
- begin (str) – range begin string
- end (str) – range end string
- get_begin (bool) – if True : include lines having the same value as the range begin, Default : True
- get_end (bool) – if True : include lines having the same value as the range end, Default : False
- key (int or str or callable) –
Specify what should really be compared:
- None : the whole current line,
- an int : test only the specified column (for list of lists),
- a string : test only the dict value for the specified key (for list of dicts),
- a callable : it will receive the line being tested and return the string to really compare.
Note : key argument MUST BE PASSED BY NAME
Yields: str or list or dict – lines having values inside the specified range
Examples
>>> logs = '''2015-08-11 aaaa
... 2015-08-23 bbbb
... 2015-09-14 ccc
... 2015-11-05 ddd'''
>>> logs | inrange('2015-08-12','2015-11-05').tolist()
['2015-08-23 bbbb', '2015-09-14 ccc']
>>> logs >> inrange('2015-08-12','2015-11-05')
['2015-08-23 bbbb', '2015-09-14 ccc']
>>> logs = '''aaaa 2015-08-11
... bbbb 2015-08-23
... cccc 2015-09-14
... dddd 2015-11-05'''
>>> logs >> inrange('2015-08-12','2015-11-05')
[]
>>> logs >> inrange('2015-08-12','2015-11-05',key=lambda l:l.cut(col=1))
['bbbb 2015-08-23', 'cccc 2015-09-14']
>>> logs = [ ('aaaa','2015-08-11'),
... ('bbbb','2015-08-23'),
... ('ccc','2015-09-14'),
... ('ddd','2015-11-05') ]
>>> logs | inrange('2015-08-12','2015-11-05',key=1).tolist()
[('bbbb', '2015-08-23'), ('ccc', '2015-09-14')]
>>> logs = [ {'data':'aaaa','date':'2015-08-11'},
... {'data':'bbbb','date':'2015-08-23'},
... {'data':'ccc','date':'2015-09-14'},
... {'data':'ddd','date':'2015-11-05'} ]
>>> logs | inrange('2015-08-12','2015-11-05',key='date').tolist()
[{'date': '2015-08-23', 'data': 'bbbb'}, {'date': '2015-09-14', 'data': 'ccc'}]
>>> ints = '1\n2\n01\n02\n11\n12\n22\n20'
>>> ints | inrange(1,3).tolist()
['1', '2', '01', '02']
>>> ints | inrange('1','3').tolist()
['1', '2', '11', '12', '22', '20']
>>> ints | inrange('1','3',get_begin=False).tolist()
['2', '11', '12', '22', '20']
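The difference between `inrange(1,3)` and `inrange('1','3')` comes from the type conversion. The sketch below makes that step explicit. It is not the library's code: `inrange_sketch` is a hypothetical helper, and casting with `type(begin)` is an assumption inferred from the description and the examples.

```python
def inrange_sketch(items, begin, end, get_begin=True, get_end=False, key=None):
    """Yield items whose tested value lies between begin and end."""
    cast = type(begin)  # assumption: values are converted to the type of `begin`
    for item in items:
        if key is None:
            raw = item
        elif callable(key):
            raw = key(item)   # the callable extracts the value to compare
        else:
            raw = item[key]   # column index or dict key
        value = cast(raw)
        lower_ok = value >= begin if get_begin else value > begin
        upper_ok = value <= end if get_end else value < end
        if lower_ok and upper_ok:
            yield item

ints = '1\n2\n01\n02\n11\n12\n22\n20'.splitlines()
print(list(inrange_sketch(ints, 1, 3)))      # numeric comparison
# ['1', '2', '01', '02']
print(list(inrange_sketch(ints, '1', '3')))  # string comparison
# ['1', '2', '11', '12', '22', '20']

logs = ['aaaa 2015-08-11', 'bbbb 2015-08-23', 'cccc 2015-09-14', 'dddd 2015-11-05']
print(list(inrange_sketch(logs, '2015-08-12', '2015-11-05',
                          key=lambda l: l.split()[1])))
# ['bbbb 2015-08-23', 'cccc 2015-09-14']
```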
last
- class textops.last()
Return the last line/item from the input text
Returns: the last line/item from the input text
Return type: StrExt, ListExt or DictExt
Examples
>>> 'a\nb\nc' | last()
'c'
>>> ['a','b','c'] | last()
'c'
>>> [('a',1),('b',2),('c',3)] | last()
['c', 3]
>>> [['key1','val1','help1'],['key2','val2','help2']] | last()
['key2', 'val2', 'help2']
>>> [{'key':'a','val':1},{'key':'b','val':2},{'key':'c','val':3}] | last()
{'key': 'c', 'val': 3}
lessequal
- class textops.lessequal(value, key=None)
Extract lines with value less than or equal to specified string
It works like textops.lessthan except the test is “less than or equal to”
Parameters:
- value (str) – string to test with
- key (int or str or callable) –
Specify what should really be compared:
- None : the whole current line,
- an int : test only the specified column (for list of lists),
- a string : test only the dict value for the specified key (for list of dicts),
- a callable : it will receive the line being tested and return the string to really compare.
Note : key argument MUST BE PASSED BY NAME
Yields: str or list or dict – lines having values less than or equal to the specified value
Examples
>>> logs = '''2015-08-11 aaaa
... 2015-08-23 bbbb
... 2015-09-14 ccc
... 2015-11-05 ddd'''
>>> logs | lessequal('2015-09-14').tolist()
['2015-08-11 aaaa', '2015-08-23 bbbb']
>>> logs >> lessequal('2015-09-14')
['2015-08-11 aaaa', '2015-08-23 bbbb']
>>> logs | lessequal('2015-09-14 ccc').tolist()
['2015-08-11 aaaa', '2015-08-23 bbbb', '2015-09-14 ccc']
lessthan
- class textops.lessthan(value, key=None)
Extract lines with value strictly less than specified string
It works for any kind of list of strings, but also for list of lists and list of dicts. In these cases, one can test only one column or one key but return the whole list/dict.
Each string to be tested is temporarily converted to the same type as the first argument given to lessthan (see examples).
Parameters:
- value (str) – string to test with
- key (int or str or callable) –
Specify what should really be compared:
- None : the whole current line,
- an int : test only the specified column (for list of lists),
- a string : test only the dict value for the specified key (for list of dicts),
- a callable : it will receive the line being tested and return the string to really compare.
Note : key argument MUST BE PASSED BY NAME
Yields: str or list or dict – lines having values strictly less than the specified reference value
Examples
>>> logs = '''2015-08-11 aaaa
... 2015-08-23 bbbb
... 2015-09-14 ccc
... 2015-11-05 ddd'''
>>> logs | lessthan('2015-09-14').tolist()
['2015-08-11 aaaa', '2015-08-23 bbbb']
>>> logs = [ ('aaaa','2015-08-11'),
... ('bbbb','2015-08-23'),
... ('ccc','2015-09-14'),
... ('ddd','2015-11-05') ]
>>> logs | lessthan('2015-11-05',key=1).tolist()
[('aaaa', '2015-08-11'), ('bbbb', '2015-08-23'), ('ccc', '2015-09-14')]
>>> logs = [ {'data':'aaaa','date':'2015-08-11'},
... {'data':'bbbb','date':'2015-08-23'},
... {'data':'ccc','date':'2015-09-14'},
... {'data':'ddd','date':'2015-11-05'} ]
>>> logs | lessthan('2015-09-14',key='date').tolist()
[{'date': '2015-08-11', 'data': 'aaaa'}, {'date': '2015-08-23', 'data': 'bbbb'}]
>>> ints = '1\n2\n01\n02\n11\n12\n22\n20'
>>> ints | lessthan(3).tolist()
['1', '2', '01', '02']
>>> ints | lessthan('3').tolist()
['1', '2', '01', '02', '11', '12', '22', '20']
merge_dicts
- class textops.merge_dicts
Merge a list of dicts into one single dict
Returns: merged dicts
Return type: dict
Examples
>>> pattern=r'item="(?P<item>[^"]*)" count="(?P<i_count>[^"]*)" price="(?P<i_price>[^"]*)"'
>>> s='item="col1" count="col2" price="col3"\nitem="col11" count="col22" price="col33"'
>>> s | cutkv(pattern,key_name='item')
[{'col1': {'item': 'col1', 'i_price': 'col3', 'i_count': 'col2'}},...
{'col11': {'item': 'col11', 'i_price': 'col33', 'i_count': 'col22'}}]
>>> s | cutkv(pattern,key_name='item').merge_dicts()
{'col11': {'item': 'col11', 'i_price': 'col33', 'i_count': 'col22'},...
'col1': {'item': 'col1', 'i_price': 'col3', 'i_count': 'col2'}}
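In plain Python, the merge can be pictured as successive `dict.update` calls. This is a sketch only: `merge_dicts_sketch` is a hypothetical helper, and the rule that a later dict overrides an earlier one on duplicate keys is an assumption the source does not state.

```python
def merge_dicts_sketch(dicts):
    """Merge a list of dicts into a single dict."""
    merged = {}
    for d in dicts:
        merged.update(d)  # assumption: on duplicate keys, the later dict wins
    return merged

rows = [
    {'col1': {'item': 'col1', 'i_price': 'col3', 'i_count': 'col2'}},
    {'col11': {'item': 'col11', 'i_price': 'col33', 'i_count': 'col22'}},
]
merged = merge_dicts_sketch(rows)
print(sorted(merged))  # ['col1', 'col11']
```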
mapfn
- class textops.mapfn(map_fn)
Apply a specified function on every line
It works like the Python map() function.
Parameters: map_fn (callable) – a function or a callable to apply on every line
Yields: any – lines modified by the map_fn function
Examples
>>> ['a','b','c'] | mapfn(lambda l:l*2).tolist()
['aa', 'bb', 'cc']
>>> ['a','b','c'] >> mapfn(lambda l:l*2)
['aa', 'bb', 'cc']
mapif
- class textops.mapif(map_fn, filter_fn=None)
Filters and maps the input text with 2 specified functions
Filters the input text AND applies a map function to every filtered line.
Parameters: Yields: any – lines filtered by the filter_fn function and modified by map_fn function
Examples
>>> import re
>>> 'a1\nb2\nc3\nd4' | mapif(lambda l:l*2,lambda l:int(re.sub(r'\D','',l)) % 2).tolist()
['a1a1', 'c3c3']
>>> 'a1\nb2\nc3\nd4' >> mapif(lambda l:l*2,lambda l:int(re.sub(r'\D','',l)) % 2)
['a1a1', 'c3c3']
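The filter-then-map combination corresponds to the built-in `filter()` and `map()` functions chained together, or to a single comprehension. The sketch below shows both on the same data as the example; the name `has_odd_number` is illustrative.

```python
import re

lines = 'a1\nb2\nc3\nd4'.splitlines()

def has_odd_number(line):
    """Truthy when the digits in the line form an odd number."""
    return int(re.sub(r'\D', '', line)) % 2

# Filter first, then map: the map function only sees the kept lines.
with_builtins = list(map(lambda l: l * 2, filter(has_odd_number, lines)))
with_comprehension = [l * 2 for l in lines if has_odd_number(l)]
print(with_builtins)  # ['a1a1', 'c3c3']
```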
mrun¶
- class textops.mrun(context={})¶
Run multiple commands from the input text and return execution output
This works like textops.run except that each line of the input text will be used as a command. The input text must be a list of strings (list, generator, or newline separated), not a list of lists. Commands will be executed inside a shell. If a context dict is specified, commands are formatted with that context (str.format).
Parameters: context (dict) – The context to format the command to run
Yields: str – the execution output
Examples
>>> cmds = 'mkdir -p /tmp/textops_tests_run\n'
>>> cmds+= 'cd /tmp/textops_tests_run;touch f1 f2 f3\n'
>>> cmds+= 'ls /tmp/textops_tests_run'
>>> print cmds | mrun().tostr()
f1
f2
f3
>>> cmds=['mkdir -p /tmp/textops_tests_run',
... 'cd /tmp/textops_tests_run; touch f1 f2 f3']
>>> cmds.append('ls /tmp/textops_tests_run')
>>> print cmds | mrun().tostr()
f1
f2
f3
>>> print cmds >> mrun()
['f1', 'f2', 'f3']
>>> cmds = ['ls {path}', 'echo "Cool !"']
>>> print cmds | mrun({'path':'/tmp/textops_tests_run'}).tostr()
f1
f2
f3
Cool !
outrange¶
- class textops.outrange(begin, end, get_begin=False, get_end=False, key=None)¶
Extract lines NOT between a range of strings
Works like textops.inrange except it yields lines that are NOT in the range
Parameters:
- begin (str) – range begin string
- end (str) – range end string
- get_begin (bool) – if True : include lines having the same value as the range begin, Default : False
- get_end (bool) – if True : include lines having the same value as the range end, Default : False
- key (int or str or callable) –
Specify what should really be compared:
- None : the whole current line,
- an int : test only the specified column (for list of lists),
- a string : test only the dict value for the specified key (for list of dicts),
- a callable : it will receive the line being tested and return the string to really compare.
Note : key argument MUST BE PASSED BY NAME
Yields: str or list or dict – lines having values outside the specified range
Examples
>>> logs = '''2015-08-11 aaaa
... 2015-08-23 bbbb
... 2015-09-14 ccc
... 2015-11-05 ddd'''
>>> logs | outrange('2015-08-12','2015-11-05').tolist()
['2015-08-11 aaaa', '2015-11-05 ddd']
>>> logs | outrange('2015-08-23 bbbb','2015-09-14 ccc').tolist()
['2015-08-11 aaaa', '2015-11-05 ddd']
>>> logs | outrange('2015-08-23 bbbb','2015-09-14 ccc', get_begin=True).tolist()
['2015-08-11 aaaa', '2015-08-23 bbbb', '2015-11-05 ddd']
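The comparison rule implied by these examples can be sketched in plain Python. The helper outrange_sketch is hypothetical, leaves out the key argument, and infers the boundary handling from the outputs shown above rather than from the library source.

```python
def outrange_sketch(lines, begin, end, get_begin=False, get_end=False):
    """Yield lines whose value falls outside [begin, end] (string comparison).

    Assumption: lines equal to a boundary are dropped unless the matching
    get_begin / get_end flag is set.
    """
    for line in lines:
        if line < begin or line > end:
            yield line
        elif get_begin and line == begin:
            yield line
        elif get_end and line == end:
            yield line


logs = ['2015-08-11 aaaa', '2015-08-23 bbbb', '2015-09-14 ccc', '2015-11-05 ddd']
outside = list(outrange_sketch(logs, '2015-08-12', '2015-11-05'))
with_begin = list(outrange_sketch(logs, '2015-08-23 bbbb', '2015-09-14 ccc',
                                  get_begin=True))
```

Note that '2015-11-05 ddd' compares greater than '2015-11-05' because it is a longer string with the same prefix, which is why it counts as outside the range.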
renderdicts¶
- class textops.renderdicts(format_str='{key} : {val}', defvalue='-')¶
Formats list of dicts
It works like formatdicts except it does NOT do the final join.
Parameters:
Returns: list of formatted strings
Return type: generator of strings
Examples
>>> input = [{'key':'a','val':1},{'key':'b','val':2},{'key':'c'}]
>>> input >> renderdicts()
['a : 1', 'b : 2', 'c : -']
>>> input >> renderdicts('{key} -> {val}',defvalue='N/A')
['a -> 1', 'b -> 2', 'c -> N/A']
>>> input = [{'name':'Eric','age':47,'level':'guru'},
... {'name':'Guido','age':59,'level':'god'}]
>>> input >> renderdicts('{name}({age}) : {level}')
['Eric(47) : guru', 'Guido(59) : god']
>>> input >> renderdicts('{name}')
['Eric', 'Guido']
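One way to get the "default value for missing keys" behaviour seen above is str.format_map with a dict subclass that defines __missing__. The names below are hypothetical and this is a sketch of the idea, not the library's code.

```python
class _WithDefault(dict):
    """Dict that returns a fixed default for keys that are absent."""

    def __init__(self, data, defvalue):
        super().__init__(data)
        self._defvalue = defvalue

    def __missing__(self, key):
        return self._defvalue


def renderdicts_sketch(dicts, format_str='{key} : {val}', defvalue='-'):
    """Yield one formatted string per dict, without joining them."""
    for d in dicts:
        yield format_str.format_map(_WithDefault(d, defvalue))


rows = [{'key': 'a', 'val': 1}, {'key': 'b', 'val': 2}, {'key': 'c'}]
rendered = list(renderdicts_sketch(rows))
```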
renderitems¶
- class textops.renderitems(format_str='{0} : {1}')¶
Renders list of 2-sized tuples
It works like formatitems except it does NOT do the final join.
Parameters: format_str (str) – format string, default is ‘{0} : {1}’
Returns: list of formatted strings
Return type: generator of strings
Examples
>>> [('key1','val1'),('key2','val2')] >> renderitems('{0} -> {1}')
['key1 -> val1', 'key2 -> val2']
>>> [('key1','val1'),('key2','val2')] >> renderitems('{0}:{1}')
['key1:val1', 'key2:val2']
renderlists¶
- class textops.renderlists(format_str='{0} : {1}')¶
Formats list of lists
It works like formatlists except it does NOT do the final join.
Parameters: format_str (str) – format string, default is ‘{0} : {1}’
Returns: list of formatted strings
Return type: generator of strings
Examples
>>> [['key1','val1','help1'],['key2','val2','help2']] >> renderlists('{2} : {0} -> {1}')
['help1 : key1 -> val1', 'help2 : key2 -> val2']
>>> [['key1','val1','help1'],['key2','val2','help2']] >> renderlists('{0}:{1} ({2})')
['key1:val1 (help1)', 'key2:val2 (help2)']
resplitblock¶
- class textops.resplitblock(pattern, include_separator=0, skip_first=False)¶
split a text into blocks using re.finditer()
This works like textops.splitblock except that it uses re : it is faster and makes it possible to search for multi-line patterns. BUT the whole input text must fit into memory. Lists of strings are also converted into a single string with newlines during the process.
Parameters:
- pattern (str) – The pattern to find
- include_separator (int) –
Tells whether blocks must include searched pattern
- 0 or SPLIT_SEP_NONE : no,
- 1 or SPLIT_SEP_BEGIN : yes, at block beginning,
- 2 or SPLIT_SEP_END : yes, at block ending
Default: 0
- skip_first (bool) – If True, the result will not contain the block before the first pattern found. Default : False.
Returns: the split input text
Return type: generator
Examples
>>> s='''
... this
... is
... section 1
... =================
... this
... is
... section 2
... =================
... this
... is
... section 3
... '''
>>> s >> resplitblock(r'^======+$')
['\nthis\nis\nsection 1\n', '\nthis\nis\nsection 2\n', '\nthis\nis\nsection 3\n']
>>> s >> resplitblock(r'^======+$',skip_first=True)
['\nthis\nis\nsection 2\n', '\nthis\nis\nsection 3\n']

>>> s='''Section: 1
... info 1.1
... info 1.2
... Section: 2
... info 2.1
... info 2.2
... Section: 3
... info 3.1
... info 3.2'''
>>> s >> resplitblock(r'^Section:',SPLIT_SEP_BEGIN)
['', 'Section: 1\ninfo 1.1\ninfo 1.2\n', 'Section: 2\ninfo 2.1\ninfo 2.2\n', 'Section: 3\ninfo 3.1\ninfo 3.2']
>>> s >> resplitblock(r'^Section:',SPLIT_SEP_BEGIN,True)
['Section: 1\ninfo 1.1\ninfo 1.2\n', 'Section: 2\ninfo 2.1\ninfo 2.2\n', 'Section: 3\ninfo 3.1\ninfo 3.2']

>>> s='''info 1.1
... Last info 1.2
... info 2.1
... Last info 2.2
... info 3.1
... Last info 3.2'''
>>> s >> resplitblock(r'^Last info[^\n\r]*[\n\r]?',SPLIT_SEP_END)
['info 1.1\nLast info 1.2\n', 'info 2.1\nLast info 2.2\n', 'info 3.1\nLast info 3.2']

>>> s='''
... =========
... Section 1
... =========
... info 1.1
... info 1.2
... =========
... Section 2
... =========
... info 2.1
... info 2.2
... '''
>>> s >> resplitblock('^===+\n[^\n]+\n===+\n')
['\n', 'info 1.1\ninfo 1.2\n', 'info 2.1\ninfo 2.2\n']
>>> s >> resplitblock('^===+\n[^\n]+\n===+\n',SPLIT_SEP_BEGIN)
['\n', '=========\nSection 1\n=========\ninfo 1.1\ninfo 1.2\n', '=========\nSection 2\n=========\ninfo 2.1\ninfo 2.2\n']
>>> s >> resplitblock('^===+\n[^\n]+\n===+\n',SPLIT_SEP_BEGIN, True)
['=========\nSection 1\n=========\ninfo 1.1\ninfo 1.2\n', '=========\nSection 2\n=========\ninfo 2.1\ninfo 2.2\n']
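To make the role of include_separator concrete, here is a rough sketch built on re.finditer over the whole text. The names resplit_sketch and SEP_* are hypothetical; the use of re.MULTILINE and the choice to drop an empty trailing remainder are assumptions inferred from the outputs above.

```python
import re

SEP_NONE, SEP_BEGIN, SEP_END = 0, 1, 2  # mirror the documented 0 / 1 / 2 values


def resplit_sketch(text, pattern, include_separator=SEP_NONE, skip_first=False):
    """Split text into blocks around every match of pattern."""
    blocks = []
    pos = 0
    for m in re.finditer(pattern, text, re.MULTILINE):
        # SEP_END keeps the separator at the end of the current block.
        cut = m.end() if include_separator == SEP_END else m.start()
        blocks.append(text[pos:cut])
        # SEP_BEGIN restarts the next block at the separator itself.
        pos = m.start() if include_separator == SEP_BEGIN else m.end()
    if text[pos:]:  # assumption: an empty trailing remainder is dropped
        blocks.append(text[pos:])
    return blocks[1:] if skip_first else blocks


text = 'Section: 1\ninfo 1.1\nSection: 2\ninfo 2.1'
blocks = resplit_sketch(text, r'^Section:', SEP_BEGIN)
```

Because the pattern is applied to one string, it can span several lines, which is the trade-off described above: multi-line separators in exchange for holding the whole text in memory.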
run¶
- class textops.run(context={})¶
Run the command from the input text and return execution output
This text operation uses subprocess.Popen to call the command. If the command is a string, it will be executed within a shell. If the command is a list (the command and its arguments), the command is executed without a shell. If a context dict is specified, the command is formatted with that context (str.format).
Parameters: context (dict) – The context to format the command to run
Yields: str – the execution output
Examples
>>> cmd = 'mkdir -p /tmp/textops_tests_run;\
... cd /tmp/textops_tests_run; touch f1 f2 f3; ls'
>>> print cmd | run().tostr()
f1
f2
f3
>>> print cmd >> run()
['f1', 'f2', 'f3']
>>> print ['ls', '/tmp/textops_tests_run'] | run().tostr()
f1
f2
f3
>>> print ['ls', '{path}'] | run({'path':'/tmp/textops_tests_run'}).tostr()
f1
f2
f3
sed¶
- class textops.sed(pat, repl)¶
Replace pattern on-the-fly
Works like the shell command ‘sed’. It uses re.sub() to replace the pattern, which means that you can include back-references in the replacement string.
Parameters:
Yields: str – the replaced lines from the input text
Examples
>>> 'Hello Eric\nHello Guido' | sed('Hello','Bonjour').tostr()
'Bonjour Eric\nBonjour Guido'
>>> [ 'Hello Eric','Hello Guido'] | sed('Hello','Bonjour').tolist()
['Bonjour Eric', 'Bonjour Guido']
>>> [ 'Hello Eric','Hello Guido'] >> sed('Hello','Bonjour')
['Bonjour Eric', 'Bonjour Guido']
>>> [ 'Hello Eric','Hello Guido'] | sed(r'$',' !').tolist()
['Hello Eric !', 'Hello Guido !']
>>> import re
>>> [ 'Hello Eric','Hello Guido'] | sed(re.compile('hello',re.I),'Good bye').tolist()
['Good bye Eric', 'Good bye Guido']
>>> [ 'Hello Eric','Hello Guido'] | sed('hello','Good bye').tolist()
['Hello Eric', 'Hello Guido']
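Since the replacement goes through re.sub(), a back-reference such as \1 in the replacement string refers to a captured group. The sketch below shows that with the standard library only; sed_sketch is a hypothetical stand-in, not the textops class.

```python
import re


def sed_sketch(lines, pat, repl):
    """Yield each line with pat replaced by repl, using re.sub()."""
    for line in lines:
        yield re.sub(pat, repl, line)


names = ['Hello Eric', 'Hello Guido']
# \1 in the replacement refers to the first captured group (the name).
swapped = list(sed_sketch(names, r'Hello (\w+)', r'\1, hello'))
# A compiled pattern carries its own flags, e.g. case insensitivity.
goodbye = list(sed_sketch(names, re.compile('hello', re.I), 'Good bye'))
```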
sedi¶
- class textops.sedi(pat, repl)¶
Replace pattern on-the-fly (case insensitive)
Works like textops.sed except that a search pattern given as a string is case insensitive.
Parameters:
Yields: str – the replaced lines from the input text
Examples
>>> [ 'Hello Eric','Hello Guido'] | sedi('hello','Good bye').tolist()
['Good bye Eric', 'Good bye Guido']
>>> [ 'Hello Eric','Hello Guido'] >> sedi('hello','Good bye')
['Good bye Eric', 'Good bye Guido']
skip¶
- class textops.skip(lines)¶
Skip n lines
It will return the input text except the first n lines
Parameters: lines (int) – The number of lines/items to skip.
Yields: str, lists or dicts – the input text without its first ‘lines’ lines
Examples
>>> 'a\nb\nc' | skip(1).tostr()
'b\nc'
>>> for l in 'a\nb\nc' | skip(1):
...     print l
b
c
>>> ['a','b','c'] | skip(1).tolist()
['b', 'c']
>>> ['a','b','c'] >> skip(1)
['b', 'c']
>>> [('a',1),('b',2),('c',3)] | skip(1).tolist()
[('b', 2), ('c', 3)]
>>> [{'key':'a','val':1},{'key':'b','val':2},{'key':'c','val':3}] | skip(1).tolist()
[{'val': 2, 'key': 'b'}, {'val': 3, 'key': 'c'}]
span¶
- class textops.span(nbcols, fill_str='')¶
Ensure that a list of lists has exactly the specified number of columns
This is useful in for-loops with multiple assignment
Parameters:
Returns: A list with exactly nbcols columns
Return type: list
Examples
>>> s='a\nb c\nd e f g h\ni j k\n\n'
>>> s | cut()
[['a'], ['b', 'c'], ['d', 'e', 'f', 'g', 'h'], ['i', 'j', 'k'], []]
>>> s | cut().span(3,'-').tolist()
[['a', '-', '-'], ['b', 'c', '-'], ['d', 'e', 'f'], ['i', 'j', 'k'], ['-', '-', '-']]
>>> s >> cut().span(3,'-')
[['a', '-', '-'], ['b', 'c', '-'], ['d', 'e', 'f'], ['i', 'j', 'k'], ['-', '-', '-']]
>>> for x,y,z in s | cut().span(3,'-'):
...     print x,y,z
a - -
b c -
d e f
i j k
- - -
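The pad-or-truncate rule behind these outputs fits in a few lines of plain Python. span_sketch is a hypothetical helper written to match the examples above, not the library's implementation.

```python
def span_sketch(rows, nbcols, fill_str=''):
    """Yield each row truncated or padded to exactly nbcols columns."""
    for row in rows:
        row = list(row)[:nbcols]                      # drop extra columns
        yield row + [fill_str] * (nbcols - len(row))  # pad short rows


rows = [['a'], ['b', 'c'], ['d', 'e', 'f', 'g', 'h'], ['i', 'j', 'k'], []]
fixed = list(span_sketch(rows, 3, '-'))

# Every row now has three items, so tuple unpacking cannot fail.
triples = [(x, y, z) for x, y, z in fixed]
```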
splitblock¶
- class textops.splitblock(pattern, include_separator=0, skip_first=False)¶
split a text into blocks
This operation splits a text made of several blocks separated by the same pattern. The separator pattern must fit on one line; as a result, the operation is not limited by the size of the input text. Each block, however, must fit in memory (i.e. the input text can contain an unlimited number of blocks, as long as each one fits in memory on its own).
Parameters:
- pattern (str) – The pattern to find
- include_separator (int) –
Tells whether blocks must include searched pattern
- 0 or SPLIT_SEP_NONE : no,
- 1 or SPLIT_SEP_BEGIN : yes, at block beginning,
- 2 or SPLIT_SEP_END : yes, at block ending
Default: 0
- skip_first (bool) – If True, the result will not contain the block before the first pattern found. Default : False.
Returns: the split input text
Return type: generator
Examples
>>> s='''
... this
... is
... section 1
... =================
... this
... is
... section 2
... =================
... this
... is
... section 3
... '''
>>> s >> splitblock(r'^======+$')
[['', 'this', 'is', 'section 1'], ['this', 'is', 'section 2'], ['this', 'is', 'section 3']]
>>> s >> splitblock(r'^======+$',skip_first=True)
[['this', 'is', 'section 2'], ['this', 'is', 'section 3']]

>>> s='''Section: 1
... info 1.1
... info 1.2
... Section: 2
... info 2.1
... info 2.2
... Section: 3
... info 3.1
... info 3.2'''
>>> s >> splitblock(r'^Section:',SPLIT_SEP_BEGIN)
[[], ['Section: 1', 'info 1.1', 'info 1.2'], ['Section: 2', 'info 2.1', 'info 2.2'], ['Section: 3', 'info 3.1', 'info 3.2']]
>>> s >> splitblock(r'^Section:',SPLIT_SEP_BEGIN,True)
[['Section: 1', 'info 1.1', 'info 1.2'], ['Section: 2', 'info 2.1', 'info 2.2'], ['Section: 3', 'info 3.1', 'info 3.2']]

>>> s='''info 1.1
... Last info 1.2
... info 2.1
... Last info 2.2
... info 3.1
... Last info 3.2'''
>>> s >> splitblock(r'^Last info',SPLIT_SEP_END)
[['info 1.1', 'Last info 1.2'], ['info 2.1', 'Last info 2.2'], ['info 3.1', 'Last info 3.2']]
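The "one block in memory at a time" property comes from reading the input line by line and yielding a block as soon as a separator is seen. The generator below sketches that idea; splitblock_sketch and the SEP_* names are hypothetical, and the handling of a trailing empty block is an assumption chosen to match the outputs above.

```python
import re

SEP_NONE, SEP_BEGIN, SEP_END = 0, 1, 2  # mirror the documented 0 / 1 / 2 values


def splitblock_sketch(lines, pattern, include_separator=SEP_NONE, skip_first=False):
    """Yield lists of lines, cutting at every line that matches pattern."""
    regex = re.compile(pattern)
    block = []
    first = True  # True until the first separator has been seen
    for line in lines:
        if regex.search(line):
            if include_separator == SEP_END:
                block.append(line)
            if not (first and skip_first):
                yield block
            first = False
            block = [line] if include_separator == SEP_BEGIN else []
        else:
            block.append(line)
    # Assumption: a trailing empty block is not emitted.
    if block and not (first and skip_first):
        yield block


text = 'Section: 1\ninfo 1.1\nSection: 2\ninfo 2.1'
blocks = list(splitblock_sketch(text.splitlines(), r'^Section:', SEP_BEGIN))
```

Only the current block list is held at any time, so the input can be an arbitrarily long iterator of lines.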
subslice¶
- class textops.subslice(begin=0, end=sys.maxsize, step=1)¶
Get a slice of columns for list of lists
Parameters:
Returns: A slice of the original text
Return type: generator
Examples
>>> s='a\nb c\nd e f g h\ni j k\n\n'
>>> s | cut().span(3,'-').tolist()
[['a', '-', '-'], ['b', 'c', '-'], ['d', 'e', 'f'], ['i', 'j', 'k'], ['-', '-', '-']]
>>> s | cut().span(3,'-').subslice(1,3).tolist()
[['-', '-'], ['c', '-'], ['e', 'f'], ['j', 'k'], ['-', '-']]
>>> s >> cut().span(3,'-').subslice(1,3)
[['-', '-'], ['c', '-'], ['e', 'f'], ['j', 'k'], ['-', '-']]
subitem¶
- class textops.subitem(n)¶
Get a specified column for list of lists
Parameters: n (int) – column number to get.
Returns: A list
Return type: generator
Examples
>>> s='a\nb c\nd e f g h\ni j k\n\n'
>>> s | cut().span(3,'-').tolist()
[['a', '-', '-'], ['b', 'c', '-'], ['d', 'e', 'f'], ['i', 'j', 'k'], ['-', '-', '-']]
>>> s | cut().span(3,'-').subitem(1).tolist()
['-', 'c', 'e', 'j', '-']
>>> s >> cut().span(3,'-').subitem(1)
['-', 'c', 'e', 'j', '-']
>>> s >> cut().span(3,'-').subitem(-1)
['-', '-', 'f', 'k', '-']
subitems¶
- class textops.subitems(ntab)
Get the specified columns for list of lists
Parameters: ntab – column numbers to get.
Returns: A list
Return type: generator
Examples
>>> s='a\nb c\nd e f g h\ni j k\n\n'
>>> s | cut().span(3,'-').tolist()
[['a', '-', '-'], ['b', 'c', '-'], ['d', 'e', 'f'], ['i', 'j', 'k'], ['-', '-', '-']]
>>> s | cut().span(3,'-').subitem(1).tolist()
['-', 'c', 'e', 'j', '-']
>>> s >> cut().span(3,'-').subitem(1)
['-', 'c', 'e', 'j', '-']
>>> s >> cut().span(3,'-').subitem(-1)
['-', '-', 'f', 'k', '-']
tail¶
- class textops.tail(lines)¶
Return last lines from the input text
Parameters: lines (int) – The number of lines/items to return.
Yields: str, lists or dicts – the last ‘lines’ lines from the input text
Examples
>>> 'a\nb\nc' | tail(2).tostr()
'b\nc'
>>> for l in 'a\nb\nc' | tail(2):
...     print l
b
c
>>> ['a','b','c'] | tail(2).tolist()
['b', 'c']
>>> ['a','b','c'] >> tail(2)
['b', 'c']
>>> [('a',1),('b',2),('c',3)] | tail(2).tolist()
[('b', 2), ('c', 3)]
>>> [{'key':'a','val':1},{'key':'b','val':2},{'key':'c','val':3}] | tail(2).tolist()
[{'val': 2, 'key': 'b'}, {'val': 3, 'key': 'c'}]
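A common way to keep only the last n items of any iterable, including a generator whose length is not known in advance, is a bounded collections.deque. This is a sketch of that technique under the hypothetical name tail_sketch; it is not claimed to be how textops implements tail.

```python
from collections import deque


def tail_sketch(items, lines):
    """Return the last `lines` items of any iterable.

    A deque with maxlen discards the oldest item whenever it is full,
    so at most `lines` items are held in memory at once.
    """
    return list(deque(items, maxlen=lines))


last_two = tail_sketch(iter(['a', 'b', 'c']), 2)
```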
uniq¶
- class textops.uniq¶
Remove all line repetitions
If a line appears several times in the same text (even with other lines in between), only the first occurrence is kept. Also works with lists of lists or dicts.
Returns: the input text with repeated lines removed.
Return type: generator
Examples
>>> s='f\na\nb\na\nc\nc\ne\na\nc\nf'
>>> s >> uniq()
['f', 'a', 'b', 'c', 'e']
>>> for line in s | uniq():
...     print line
f
a
b
c
e
>>> l = [ [1,2], [3,4], [1,2] ]
>>> l >> uniq()
[[1, 2], [3, 4]]
>>> d = [ {'a':1}, {'b':2}, {'a':1} ]
>>> d >> uniq()
[{'a': 1}, {'b': 2}]
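Lists and dicts are not hashable, so a set cannot be used to remember what has already been seen. A minimal order-preserving sketch that works for such items is shown below; uniq_sketch is a hypothetical name, and the linear scan of seen items is a simplification that makes the whole pass quadratic.

```python
def uniq_sketch(items):
    """Yield each item the first time it appears, preserving order.

    A list is used for `seen` so that unhashable items (lists, dicts)
    can be compared by equality.
    """
    seen = []
    for item in items:
        if item not in seen:
            seen.append(item)
            yield item


lines = 'f\na\nb\na\nc\nc\ne\na\nc\nf'.splitlines()
unique_lines = list(uniq_sketch(lines))
unique_rows = list(uniq_sketch([[1, 2], [3, 4], [1, 2]]))
```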