
Transform only one axis to log10 scale with ggplot2

I have the following problem: I would like to visualize a discrete and a continuous variable on a boxplot in which the latter has a few extremely high values. This makes the boxplot meaningless (the points and even the "body" of the chart are too small), which is why I would like to show it on a log10 scale. I am aware that I could leave the extreme values out of the visualization, but I do not want to.

Let's see a simple example with diamonds data:

m <- ggplot(diamonds, aes(y = price, x = color))

https://i.stack.imgur.com/aK2Ro.png

The problem is not serious here, but I hope you can see why I would like to show the values on a log10 scale. Let's try it:

m + geom_boxplot() + coord_trans(y = "log10")

https://i.stack.imgur.com/ifWhk.png

As you can see, the y axis is log10 scaled and looks fine, but there is a problem with the x axis, which makes the plot look very strange.

The problem does not occur with scale_y_log10, but this is not an option for me, as I cannot use a custom formatter this way. E.g.:

m + geom_boxplot() + scale_y_log10() 

https://i.stack.imgur.com/SUdX5.png

My question: does anyone know a way to draw the boxplot with a log10 scale on the y axis whose labels can be freely formatted with a formatter function, like in this thread?

Editing the question to help answerers based on answers and comments:

What I am really after: one log10-transformed axis (y) with non-scientific labels. I would like to label it in dollars (formatter=dollar) or in any custom format.

If I try @hadley's suggestion I get the following warnings:

> m + geom_boxplot() + scale_y_log10(formatter=dollar)
Warning messages:
1: In max(x) : no non-missing arguments to max; returning -Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
3: In max(x) : no non-missing arguments to max; returning -Inf

With unchanged y axis labels:

https://i.stack.imgur.com/sRScn.png

That's a bug in coord_trans - but you can specify custom labels to scale_y_log10...
Thank you @hadley, I must be missing something, but e.g. + scale_y_continuous(formatter=dollar) just does not work. I cannot see the result of any formatter I give, and I also get three "In max(x) : no non-missing arguments to max; returning -Inf" warning messages.
@daroczig: The examples I have seen for the formatter argument have all involved quoted names, so perhaps formatter="dollar"?
@DWin: I tried with quotes also, but the result is exactly the same.
Formatter doesn't work (yet) but you can still set the labels manually...

IRTFM

The simplest approach is to pass the name of the desired log function to the 'trans' (formerly 'formatter') argument of either scale_x_continuous or scale_y_continuous:

library(ggplot2)  # which formerly required pkg:plyr
m + geom_boxplot() + scale_y_continuous(trans='log10')

EDIT: Or if you don't like that, then either of these appears to give different but useful results:

m <- ggplot(diamonds, aes(y = price, x = color), log="y")
m + geom_boxplot() 
m <- ggplot(diamonds, aes(y = price, x = color), log10="y")
m + geom_boxplot()

EDIT2 & 3: Further experiments (after discarding the one that attempted successfully to put "$" signs in front of logged values):

# Need a function that accepts an x argument
# wrap desired formatting around numeric result
fmtExpLg10 <- function(x) paste(plyr::round_any(10^x/1000, 0.01) , "K $", sep="")

ggplot(diamonds, aes(color, log10(price))) + 
  geom_boxplot() + 
  scale_y_continuous("Price, log10-scaling", labels = fmtExpLg10)  # a label function goes in 'labels', not 'trans'

https://i.stack.imgur.com/vRKpS.png

Note added mid 2017 in comment about package syntax change:

scale_y_continuous(formatter = 'log10') is now scale_y_continuous(trans = 'log10') (ggplot2 v2.2.1)
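
For reference, a minimal sketch of combining the log10 transform with a label function in one scale, which is what the question asks for. It assumes the scales package is installed and a ggplot2 version in which `labels` accepts a function of the break values. I believe recent ggplot2 releases rename `trans` to `transform`, so the spelling below may produce a deprecation warning there.

```r
library(ggplot2)

# Assumption: scales is installed; scales::dollar formats numbers as "$1,000".
p <- ggplot(diamonds, aes(x = color, y = price)) +
  geom_boxplot() +
  # The axis is log10-transformed, but the labels are built from the
  # untransformed break values, so they read as plain dollar amounts.
  scale_y_continuous(trans = "log10", labels = scales::dollar)

p
```

The transform and the labels are independent here: the boxplot statistics are computed on the log10 values, while the label function only sees the breaks on the original price scale.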


Thank you @DWin, but this is not what I was looking for. This way the y axis labels are converted to log10, but the axis itself is not transformed. What I would like to get: one transformed axis (y) with non-scientific labels.
@daroczig: See if this is more satisfactory. I would have sworn that the first time I ran my first solution I got even powers of ten, but I cannot reproduce it. Maybe I was so focused on seeing the x-positions that I overlooked the obvious problems.
@daroczig: The "successful experiment" with "dollarizing" used fmtLg10dlr <- function(x) dollar(log10(x)); m + geom_boxplot() + scale_y_continuous(formatter='fmtLg10dlr') , but it just looks "wrong" to me.
I suspect you're trying to do something like ggplot(diamonds, aes(color, log10(price))) + geom_boxplot() + scale_y_continuous(formatter = function(x) format(10 ^ x)) - you need to transform the data and back-transform the labels.
daroczig

I had a similar problem and this scale worked for me like a charm:

library(scales)  # provides comma()
breaks <- 10^(1:10)
scale_y_log10(breaks = breaks, labels = comma(breaks))

As you want the intermediate levels too (10^3.5), you need to tweak the formatting:

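breaks <- 10^(1:10 * 0.5)
m <- ggplot(diamonds, aes(y = price, x = color)) + geom_boxplot()
# 'accuracy = 1' rounds labels to whole numbers (the older 'digits' argument is deprecated in scales)
m + scale_y_log10(breaks = breaks, labels = scales::comma(breaks, accuracy = 1))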

After executing:

https://i.stack.imgur.com/jAFcn.png


I just noticed this very similar problem has the same solution.
Thank you for drawing my attention to this alternative solution, which would be complete with the simple dollar formatter or a custom one: + scale_y_log10(breaks = breaks, labels = dollar(breaks))
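
A small sketch of the "custom formatter" variant mentioned in the comment above, passing a function to `labels` instead of a pre-formatted vector. The helper name `fmt_kusd` and its output format are made up for illustration; they are not from the thread or from any package.

```r
library(ggplot2)

# Hypothetical helper: show raw prices as thousands of dollars, e.g. 1000 -> "$1K".
fmt_kusd <- function(x) paste0("$", round(x / 1000, 1), "K")

# Half-decade breaks from 10^2 to 10^5, as in the answer above.
breaks <- 10^seq(2, 5, by = 0.5)

p <- ggplot(diamonds, aes(x = color, y = price)) +
  geom_boxplot() +
  # ggplot2 calls fmt_kusd on the breaks (in original price units) to build the labels.
  scale_y_log10(breaks = breaks, labels = fmt_kusd)

p
```

Passing the function rather than `fmt_kusd(breaks)` means the labels stay in sync if the breaks are changed later.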
Tung

Another solution, using scale_y_log10 with trans_breaks, trans_format and annotation_logticks():

library(ggplot2)

m <- ggplot(diamonds, aes(y = price, x = color))

m + geom_boxplot() +
  scale_y_log10(
    breaks = scales::trans_breaks("log10", function(x) 10^x),
    labels = scales::trans_format("log10", scales::math_format(10^.x))
  ) +
  theme_bw() +
  annotation_logticks(sides = 'lr') +
  theme(panel.grid.minor = element_blank())

https://i.imgur.com/nNvsOPv.png


Very elegant output
In 2020, this is the first answer that copies, pastes n' works. (Yes, I tried them all.) Thanks!
daroczig

I think I got it at last by doing some manual transformations with the data before visualization:

d <- diamonds
# computing logarithm of prices
d$price <- log10(d$price)

Then define a formatter that converts the logarithmic values back to the original scale:

formatBack <- function(x) 10^x 
# or with special formatter (here: "dollar")
formatBack <- function(x) paste(round(10^x, 2), "$", sep=' ') 

And draw the plot with the given formatter:

m <- ggplot(d, aes(y = price, x = color))
m + geom_boxplot() + scale_y_continuous(formatter='formatBack')

https://i.stack.imgur.com/L8HF6.png
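
For anyone trying this on a current ggplot2, here is a hedged sketch of the same idea (log-transform the data, back-transform the labels) with the label function passed through `labels` instead of the old `formatter` argument. The names `log_price` and `format_back` are chosen here for illustration only.

```r
library(ggplot2)

d <- diamonds
# Transform the data by hand; the axis itself stays linear in log10 units.
d$log_price <- log10(d$price)

# Back-transform a log10 value into a price label such as "1000 $".
format_back <- function(x) paste(round(10^x, 2), "$")

p <- ggplot(d, aes(x = color, y = log_price)) +
  geom_boxplot() +
  scale_y_continuous(labels = format_back)

p
```

Very large values may print in scientific notation with this simple helper, so a formatter such as scales::dollar could be swapped in if that matters.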

Apologies to the community for bothering you with a question I could have solved myself! The funny part is that I was working hard to make this plot work a month ago but did not succeed. After asking here, I got it.

Anyway, thanks to @DWin for motivation!


I think formatter has now been changed to labels => stackoverflow.com/questions/10146109/…
