
How to find the last field using 'cut'

Without using sed or awk, only cut, how do I get the last field when the number of fields is unknown or changes with every line?

Are you in love with the cut command :)? Why not any other Linux command?
Without sed or awk: perl -pe 's/^.+\s+([^\s]+)$/$1/'.
@MestreLion Many times people read a question to find a solution to a variation of a problem. This one starts with the false premise that cut supports something it doesn't. But I thought it was useful, in that it forces the reader to consider code that's easier to follow. I wanted a quick, simple way to use cut without needing to use multiple syntaxes for awk, grep, sed, etc. The rev thing did the trick; very elegant, and something I've never considered (even if clunky for other situations). I also liked reading the other approaches from the other answers.
Came here with a real-life problem: I want to find all the different file extensions in a source tree, to update a .gitattributes file with. So find | cut -d. -f<last> is the natural inclination

Sled

You could try something like this:

echo 'maps.google.com' | rev | cut -d'.' -f 1 | rev

Explanation

rev reverses "maps.google.com" to be moc.elgoog.spam

cut uses the dot (i.e. '.') as the delimiter, and chooses the first field, which is moc

lastly, we reverse it again to get com
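Since rev and cut both operate line by line, the same pipeline also handles a whole file with a varying number of fields per line. A quick sketch (the file name data.txt is just for illustration):

```shell
# Three lines with three, two, and zero delimiters respectively
printf 'maps.google.com\nfoo.bar\nplain\n' > data.txt

# Reverse each line, take the first dot-separated field, reverse back
rev data.txt | cut -d'.' -f1 | rev
# → com
# → bar
# → plain   (a line with no delimiter passes through unchanged)
```

Note that a line containing no delimiter at all is printed whole, because cut -f1 returns the entire line when the delimiter is absent.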


It's not using only cut, but it is without sed or awk. So what does the OP think?
@tom OP has asked more questions than just this in the last few hours. Based on our interactions with the OP we know that awk/sed/etc. are not allowed in his homework, but a reference to rev has not been made. So it was worth a shot
@zfus I see. Might want to stick another rev afterwards.
Double rev, great idea!
Awesome, simple, perfect, thanks for explanation too - not enough people explaining each step in long chains of piped commands
Charles Duffy

Use a parameter expansion. This is much more efficient than any kind of external command, cut (or grep) included.

data=foo,bar,baz,qux
last=${data##*,}

See BashFAQ #100 for an introduction to native string manipulation in bash.
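If the input is a file rather than a single variable, the same expansion can be applied per line with a read loop, still with no external commands. A minimal sketch (data.csv is a hypothetical input file):

```shell
# Sample input with a varying number of comma-separated fields
printf 'one,two,three\nfoo,bar\n' > data.csv

while IFS= read -r line; do
    # ${line##*,} strips everything up to and including the last comma
    printf '%s\n' "${line##*,}"
done < data.csv
# → three
# → bar
```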


@ErwinWessels: Because bash is really slow. Use bash to run pipelines, not to process data in bulk. I mean, this is great if you have one line of text already in a shell variable, or if you want to do while IFS= read -r line; do :; done < <(cmd) to process a few lines. But for a big file, rev|cut|rev is probably faster! (And of course awk will be faster than that.)
@PeterCordes, awk will be faster for a big file, sure, but it takes a fair bit of input to overcome the constant-factor startup costs. (There also exist shells -- like ksh93 -- with performance closer to awk, where the syntax given in this answer remains valid; bash is exceptionally sluggish, but it's not even close to the only option available).
Thanks @PeterCordes; as usual I guess each tool has its use cases.
This is by far the fastest and most concise way of trimming down a single variable inside a bash script (assuming you're already using a bash script). No need to call anything external.
@Balmipour, ...however, rev is specific to whatever OS you're using that provides it -- it's not standardized across all UNIX systems. See the chapter listing for the POSIX section on commands and utilities -- it's not there. And ${var##prefix_pattern} is not in fact bash-specific; it's in the POSIX sh standard, see the end of section 2.6.2 (linked), so unlike rev, it's always available on any compliant shell.
tom

It is not possible using just cut. Here is a way using grep:

grep -o '[^,]*$'

Replace the comma with whatever delimiter you need.

Explanation:

-o (--only-matching) only outputs the part of the input that matches the pattern (the default is to print the entire line if it contains a match).

[^,] is a character class that matches any character other than a comma.

* matches the preceding pattern zero or more times, so [^,]* matches zero or more non-comma characters.

$ matches the end of the string.

Putting this together, the pattern matches zero or more non-comma characters at the end of the string.

When there are multiple possible matches, grep prefers the one that starts earliest. So the entire last field will be matched.

Full example:

If we have a file called data.csv containing

one,two,three
foo,bar

then grep -o '[^,]*$' < data.csv will output

three
bar

To do the opposite, and find everything except the last field do: grep -o '^.*,'
This was especially useful because rev had an issue with multibyte Unicode characters in my case.
I was trying to do this on MinGW but my grep version doesn't support -o, so I used sed 's/^.*,//' which replaces all characters up to and including the last comma with an empty string.
Amir Mehler

Without awk ?... But it's so simple with awk:

echo 'maps.google.com' | awk -F. '{print $NF}'

AWK is a way more powerful tool to have in your pocket. -F is for the field separator; NF is the number of fields (and also the index of the last one).
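Because NF is re-evaluated for every record, $NF tracks the last field even when the field count changes from line to line. A sketch:

```shell
# Lines with three, two, and one dot-separated fields
printf 'a.b.c\nx.y\nsolo\n' | awk -F. '{print $NF}'
# → c
# → y
# → solo   (with a single field, $NF is the same as $1)
```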


This is universal and it works exactly as expected every time. In this scenario, using cut to achieve the OP's final output is like using a spoon to "cut" steak (pun intended :) ) . awk is the steak knife.
Avoid unnecessary use of echo, which may slow a script down on long inputs, by using awk -F. '{print $NF}' <<< 'maps.google.com'.
rjni

There are multiple ways. You may use this too.

echo "Your string here"| tr ' ' '\n' | tail -n1
> here

Obviously, the blank space input for tr command should be replaced with the delimiter you need.
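Note that piping a whole file through this would splice fields from different lines together, since tr converts every delimiter in the stream to a newline before tail picks one line. A per-line wrapper (a sketch) keeps the behavior correct for multi-line input:

```shell
printf 'Your string here\nanother line\n' |
while IFS= read -r line; do
    # tr turns each space into a newline; tail keeps only the last piece
    printf '%s\n' "$line" | tr ' ' '\n' | tail -n1
done
# → here
# → line
```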


This feels like the simplest answer to me, less pipes and clearer meaning
That will not work for an entire file, which is what the OP probably meant.
A friend

This is the only possible solution using nothing but cut:

echo "s.t.r.i.n.g." | cut -d'.' -f2- [repeat_following_part_forever_or_until_out_of_memory:] | cut -d'.' -f2-

Using this solution, the number of fields can indeed be unknown and vary from line to line. However, since a line's length must not exceed LINE_MAX characters (including the newline), the number of fields is bounded, so a truly arbitrary number of fields can never occur, and a finite chain of cut invocations suffices.

Yes, a very silly solution, but the only one that meets the criteria, I think.


Nice. Just take the last '.' off of "s.t.r.i.n.g." and this works.
I love when everyone says something is impossible and then someone chimes in with a working answer. Even if it is indeed very silly.
One could iterate cut -f2- in a loop until the output no longer changes.
I think you'd have to read the file line-by-line and then iterate the cut -f2- until it no longer changes. Otherwise you'd have to buffer the entire file.
jstine

If your input string doesn't contain forward slashes then you can use basename and a subshell:

$ basename "$(echo 'maps.google.com' | tr '.' '/')"

This doesn't use sed or awk, but it also doesn't use cut either, so I'm not quite sure if it qualifies as an answer to the question as it's worded.

This doesn't work well if processing input strings that can contain forward slashes. A workaround for that situation would be to replace forward slash with some other character that you know isn't part of a valid input string. For example, the pipe (|) character is also not allowed in filenames, so this would work:

$ basename "$(echo 'maps.google.com/some/url/things' | tr '/' '|' | tr '.' '/')" | tr '|' '/'

Of course the pipe character is allowed in filenames. Just try touch \|.
I will change from downvote to upvote if you remove the false claim about | being not allowed in file names. But almost every tr out there supports \0 or some other way of expressing the nul byte, and that definitely isn't allowed in file names, so you can use that as a placeholder. Also tr ab ba just swaps all a and b without problems, so you can avoid having to find a disallowed character entirely. Just pipe through tr './' './' once to swap before the basename and then again to swap back after.
Just realized I have a typo: "just pipe through tr '/.' './' once to swap before the basename and then again after".
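A sketch combining the ideas from this comment thread: tr '/.' './' swaps dots and slashes in a single pass (so no "forbidden" placeholder character is needed), basename then strips everything up to the last slash, and a second swap restores the original characters:

```shell
s='maps.google.com/some/url'

# Swap '.' and '/': maps.google.com/some/url -> maps/google/com.some.url
swapped=$(printf '%s' "$s" | tr '/.' './')

# basename keeps the part after the last '/', i.e. the original last '.'-field;
# the second swap turns the dots back into slashes
basename "$swapped" | tr '/.' './'
# → com/some/url
```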
user2166700

The following implements A friend's suggestion:

#!/bin/bash
rcut(){
  nu="$( echo "$1" | cut -d"$DELIM" -f 2- )"
  if [ "$nu" != "$1" ]
  then
    rcut "$nu"
  else
    echo "$nu"
  fi
}

$ export DELIM=.
$ rcut a.b.c.d
d

You need double quotes around the arguments to echo in order for this to work reliably and robustly. See stackoverflow.com/questions/10067266/…
moni905

An alternative using perl would be:

perl -pe 's/(.*) (.*)$/$2/' file

where you may change the space for whichever delimiter the file uses


aperson1961

If you have a file named filelist.txt that is a list of paths such as the following:
c:/dir1/dir2/file1.h
c:/dir1/dir2/dir3/file2.h

then you can do this: rev filelist.txt | cut -d"/" -f1 | rev


Kaffe Myers

Adding an approach to this old question just for the fun of it:

$ cat input.file # file containing input that needs to be processed
a;b;c;d;e
1;2;3;4;5
no delimiter here
124;adsf;15454
foo;bar;is;null;info

$ cat tmp.sh # showing off the script to do the job
#!/bin/bash
delim=';'
while read -r line; do  
    while [[ "$line" =~ "$delim" ]]; do
        line=$(cut -d"$delim" -f 2- <<<"$line")
    done
    echo "$line"
done < input.file

$ ./tmp.sh # output of above script/processed input file
e
5
no delimiter here
15454
info

Besides bash, only cut is used. Well, and echo, I guess.


Meh, why not just remove cut completely and only use bash... x] while read -r line; do echo ${line/*;}; done <input.file yields the same result.
jww

I realized that if we just ensure a trailing delimiter exists, it works. So in my case, where I have comma and whitespace delimiters, I add a space at the end:

$ ans="a, b"
$ ans+=" "; echo ${ans} | tr ',' ' ' | tr -s ' ' | cut -d' ' -f2
b

And ans="a, b, c" produces b, which does not meet the requirements of "number of fields are unknown or change with every line".