Removing specific lines from a text file?

Dean

Well-known member
#1
I've got a reasonably long file, and 99% of it is lines like these:
127.0.0.1 - - [01/Jul/2011:02:04:29 -0700] "OPTIONS * HTTP/1.0" 200 -
127.0.0.1 - - [01/Jul/2011:02:04:30 -0700] "OPTIONS * HTTP/1.0" 200 -
127.0.0.1 - - [01/Jul/2011:02:05:01 -0700] "GET /whm-server-status HTTP/1.0" 200 28577
127.0.0.1 - - [01/Jul/2011:02:05:01 -0700] "GET /whm-server-status?auto HTTP/1.1" 200 446
127.0.0.1 - - [01/Jul/2011:02:05:01 -0700] "GET /whm-server-status?auto HTTP/1.1" 200 446
127.0.0.1 - - [01/Jul/2011:02:05:02 -0700] "GET /whm-server-status?auto HTTP/1.1" 200 446
127.0.0.1 - - [01/Jul/2011:02:05:03 -0700] "GET / HTTP/1.0" 200 111
127.0.0.1 - - [01/Jul/2011:02:05:53 -0700] "OPTIONS * HTTP/1.0" 200 -
127.0.0.1 - - [01/Jul/2011:02:05:54 -0700] "OPTIONS * HTTP/1.0" 200 -
127.0.0.1 - - [01/Jul/2011:02:05:59 -0700] "OPTIONS * HTTP/1.0" 200 -
127.0.0.1 - - [01/Jul/2011:02:06:56 -0700] "OPTIONS * HTTP/1.0" 200 -
127.0.0.1 - - [01/Jul/2011:02:07:13 -0700] "OPTIONS * HTTP/1.0" 200 -
127.0.0.1 - - [01/Jul/2011:02:07:51 -0700] "OPTIONS * HTTP/1.0" 200 -
127.0.0.1 - - [01/Jul/2011:02:07:51 -0700] "OPTIONS * HTTP/1.0" 200 -
127.0.0.1 - - [01/Jul/2011:02:07:51 -0700] "OPTIONS * HTTP/1.0" 200 -
127.0.0.1 - - [01/Jul/2011:02:07:51 -0700] "OPTIONS * HTTP/1.0" 200 -
127.0.0.1 - - [01/Jul/2011:02:07:51 -0700] "OPTIONS * HTTP/1.0" 200 -
127.0.0.1 - - [01/Jul/2011:02:07:51 -0700] "OPTIONS * HTTP/1.0" 200 -
127.0.0.1 - - [01/Jul/2011:02:07:51 -0700] "OPTIONS * HTTP/1.0" 200 -
127.0.0.1 - - [01/Jul/2011:02:07:51 -0700] "OPTIONS * HTTP/1.0" 200 -
127.0.0.1 - - [01/Jul/2011:02:07:51 -0700] "OPTIONS * HTTP/1.0" 200 -
127.0.0.1 - - [01/Jul/2011:02:07:51 -0700] "OPTIONS * HTTP/1.0" 200 -
127.0.0.1 - - [01/Jul/2011:02:07:51 -0700] "OPTIONS * HTTP/1.0" 200 -
127.0.0.1 - - [01/Jul/2011:02:07:51 -0700] "OPTIONS * HTTP/1.0" 200 -
127.0.0.1 - - [01/Jul/2011:02:07:51 -0700] "OPTIONS * HTTP/1.0" 200 -
127.0.0.1 - - [01/Jul/2011:02:07:51 -0700] "OPTIONS * HTTP/1.0" 200 -
Any ideas how I could get rid of all lines that start with 127.0.0.1 ?
I'm sure there is some type of grep command, but I've no idea how to go about that. I cannot seem to do it with Notepad++ on my PC locally either. All ideas welcome :)
(fingers tired)
 

ibnesayeed

Well-known member
#2
"-v, --invert-match" is the switch for invert match in grep. That means,

grep -v "useless" old-file > new-file
will remove all the lines from "old-file" containing the term "useless" and save the output in "new-file". You can also use a regular expression in place of "useless". From your command line issue "man grep" for more details. :)
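Since the question asks about lines that start with 127.0.0.1, here is a minimal sketch of the same -v idea with the pattern anchored to the start of the line. The file names sample_access_log and sample_filtered_log and the non-local sample lines are made up for illustration.

```shell
# Build a small sample in the same format as the log above.
# (File names and the 10.0.0.x lines are hypothetical.)
cat > sample_access_log <<'EOF'
127.0.0.1 - - [01/Jul/2011:02:04:29 -0700] "OPTIONS * HTTP/1.0" 200 -
10.0.0.5 - - [01/Jul/2011:02:04:31 -0700] "GET /index.html HTTP/1.1" 200 512
127.0.0.1 - - [01/Jul/2011:02:05:03 -0700] "GET / HTTP/1.0" 200 111
10.0.0.7 - - [01/Jul/2011:02:05:10 -0700] "GET /?ref=127.0.0.1 HTTP/1.1" 200 87
EOF

# ^ anchors the match to the start of the line, \. matches a literal dot,
# and the trailing space keeps addresses such as 127.0.0.10 from matching.
# -v then keeps every line that does NOT match.
grep -v '^127\.0\.0\.1 ' sample_access_log > sample_filtered_log

cat sample_filtered_log
```

With the anchor, the last sample line survives even though 127.0.0.1 appears inside its request string; an unanchored pattern would have dropped it too.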
 

Dean

Well-known member
#4
You have made my fingers extremely happy :)

Thank You!!!

I may upgrade to that at some point. Thanks!
 

Dean

Well-known member
#5
Can I do the opposite?

Take the results of that filtered file and have just the lines containing 2 different IPs of interest piped into another file?

grep "IP1" "IP2" input_file >output_file

?
 

ibnesayeed

Well-known member
#6
If you want to have an OR match then you can use something like this:

grep -E "foo|bar" old-file > new-file
This should grab all the lines containing "foo" OR "bar" (or both, in any order) from old-file and save them in new-file. Note that the -E switch stands for --extended-regexp, so you can get as creative as you like. :)
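Applied to two IP addresses, that could look roughly like the sketch below. The addresses 10.0.0.5 and 10.0.0.7 and all file names are placeholders, not values from this thread. The second command shows an alternative that avoids regular expressions: -F matches fixed strings and each -e adds one more pattern to OR together.

```shell
# Sample log with a few client addresses (all values are hypothetical).
cat > sample_ip_log <<'EOF'
127.0.0.1 - - [01/Jul/2011:02:04:29 -0700] "OPTIONS * HTTP/1.0" 200 -
10.0.0.5 - - [01/Jul/2011:02:04:31 -0700] "GET /index.html HTTP/1.1" 200 512
10.0.0.7 - - [01/Jul/2011:02:05:10 -0700] "GET /about.html HTTP/1.1" 200 87
10.0.0.50 - - [01/Jul/2011:02:05:12 -0700] "GET /index.html HTTP/1.1" 200 512
EOF

# Extended regex: either address at the start of the line, followed by a space.
# Dots are escaped so they match a literal dot rather than any character.
grep -E '^(10\.0\.0\.5|10\.0\.0\.7) ' sample_ip_log > two_ips_regex

# Fixed strings: no escaping needed, but the match may occur anywhere in the line.
# The trailing space keeps 10.0.0.50 from being picked up.
grep -F -e '10.0.0.5 ' -e '10.0.0.7 ' sample_ip_log > two_ips_fixed

cat two_ips_regex
```

On this sample both commands keep the same two lines. The regex form is stricter because it only looks at the start of the line.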
 

ibnesayeed

Well-known member
#7
I would like to add one more thing. If you only want the lines for IP1 and IP2, you can apply the above grep expression to the raw file as well. Pre-filtering with the invert match will not make any difference to the result. :)
 

Dean

Well-known member
#8
Just to be clear, '\' is the OR? Or is the '|' the or...

I think it must be \ as you typed, because the | did not seem to work...
?
 

Dean

Well-known member
#10
It probably did not work for me because I was asking for things like ~ and %
I realized I could do the query / grep without using either of those, so I did that, and only needed to search for 1 unique item.
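On the '|' versus '\' question, a small sketch for anyone landing here later. With -E, a bare | inside quotes is the OR operator; without quotes the shell treats | as a pipe, which may be one reason it appears not to work. As far as grep is concerned, ~ and % are ordinary characters. The file names and sample lines below are invented for illustration.

```shell
# Sample lines containing ~ and % (contents are hypothetical).
cat > sample_urls <<'EOF'
GET /~dean/index.html
GET /search?q=100%25
GET /plain
EOF

# With -E, | inside single quotes means OR.
grep -E '~dean|%25' sample_urls > or_extended

# Without -E, repeating -e gives the same OR behaviour in any POSIX grep.
# (GNU grep also accepts 'a\|b' in basic mode, but that is an extension.)
grep -e '~dean' -e '%25' sample_urls > or_multi

# -F treats the whole pattern as a literal string, so nothing needs escaping.
grep -F '100%25' sample_urls > literal_match

cat or_extended
```

Single quotes around the pattern keep the shell from interpreting characters such as |, * or ~ before grep ever sees them.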
 

Floris

Guest
#11
or

if you just want 1 IP,

cat access_log |grep "thatiphere" > access_log_ip

that should work on most *nix systems.

Cleaning up the file with an inverse match would be the better solution if the IPs you're looking for change too often, or if there are more than one or two of them.