Google Sheets and the Onion Method

Working with pasted data, without editing every field.

10/26/20235 min read

Moriah's spreadsheet for comparing vacation options, titled "McLean trip 2024"
Moriah's spreadsheet for comparing vacation options, titled "McLean trip 2024"

I'm working on scheduling a trip with the help of my friend and travel agent, Emily. There were a couple tour companies she recommended, which I was thrilled to look at and overwhelmed by all the choices! I am a data nerd, so I wanted to be able to compare quantitative info about each trip on offer, plus be able to compare everything side-by-side. Pasting the details into a Google Sheet made the most sense, as I'd be able to strike through (delete) the options (rows) that were no longer on the table. But I encountered a funny problem when it came to the dates.

An overview of Moriah's spreadsheet, with date ranges in a variety of formats.
An overview of Moriah's spreadsheet, with date ranges in a variety of formats.

The dates were in one line. Fine, I thought, I'll split these lines so I have one column of start dates and one column of end dates.

The SPLIT() function has been used on the dates in the sheet, but the days of the week are off.
The SPLIT() function has been used on the dates in the sheet, but the days of the week are off.

Well, that didn't work as well as I'd hoped. I used the function SPLIT(O2, "-") to split the data in column O. This worked okay, except the days of the week were off. It took some tinkering, and changing the date format to display a lot more information, but I eventually realized that Google thought my dates were in 2023 instead of 2024. While it would be easy to append "2024" to the end of the date range and then split it, that would only cover my second date. I still needed to somehow insert "2024" into the middle of the string in order to get the first weekday right.

So I split my text, then found the TEXT() function. TEXT() requires two arguments--the origin cell, plus the desired formatting. For me, this was TEXT(P14, "mm/dd"). At this point, I had moved from my actual data into a section where I could play around with dates without messing up my carefully collected data!

Then, I was able to use CONCAT() to add another slash and the correct year. CONCAT() is short for concatenation, which is a way to say that you want to "add" two text strings together. And by "add," I mean place them side by side. My arguments here would be CONCAT(P15, "/2024") to get my result into a format of MM/DD/YYYY.

Great! Now I had a date! One big improvement would be to see the day of the week, though. I used the TEXT() function again to re-display the date. If dd is the code for day date, ddd is the code for the abbreviation for the day of the week, and dddd is the code for the full name of the day of the week. Similarly, "m" is the simple month (January = 1), "mm" is a two-digit month (January = 01), "mmm" is short month name (January = Jan), and "mmmm" is full month name (January = January). To have the day as a text abbreviation in front of the numeric month and date, and skipping the year, I used TEXT(P16, "ddd mm/dd"). At this point, I went ahead and recycled my formula for the end date as well.

Now I had the format I wanted! I just had to get it to fit into one formula. Hmm... this part might be tricky. I vaguely knew about the ability to stack formulas in Sheets. To review, I wanted to use the following formulas (which I've color coded here):

  • (P17) =TEXT(P16, "ddd mm/dd")

  • (P16) =CONCAT(P15, "/2024")

  • (P15) = TEXT(P14, "mm/dd")

  • (P14) = SPLIT(O14, "-")

I could figure out for the most part how to substitute:

(P17) =TEXT(CONCAT(TEXT(SPLIT(O14, "-"), "mm/dd"), "/2024), "ddd mm/dd")

This technique of substitution is called the Onion Method, because you insert data in multiple layers. And, it kind of looks like an onion with all those parentheses!

That didn't quite work. I had the first date formatted the way I wanted it, but I still needed the end date. I did some Googling and found out how to access the results of a SPLIT() function using INDEX(). If SPLIT() makes an imaginary table that doesn't get printed as output, INDEX() lets us access that table, aka the result of the SPLIT(). INDEX(SPLIT(...), A, B) would theoretically provide the output of the SPLIT that's in imaginary Row A, Column B. In other words, SPLIT("Oscar the Grouch", " ") is a string split by spaces, resulting in "Oscar", "the", and "Grouch". If we wanted to access "Grouch", which is in row 1, column 3, we could use INDEX(SPLIT("Oscar the Grouch", 1, 3). This would be the secret sauce to getting where I needed to be!

Finally, I could use the INDEX() formula to choose exactly which element of the SPLIT() I wanted to access. I had to do this in two formulas: one for a start date, and one for an end date.

Start date: =TEXT(CONCAT(INDEX(TEXT(SPLIT(P14, "-"), "mm/dd"), 1, 1), "/2024"), "ddd mm/dd")

End date: =TEXT(CONCAT(INDEX(TEXT(SPLIT(P14, "-"), "mm/dd"), 1, 2), "/2024"), "ddd mm/dd")

I applied the formulas to all my data and all was well. Or so I thought! I decided I should probably add the length to the dates so I could see at a glance how many days each excursion lasted. I did a simple subtraction formula to calculate length (=Q2 - P2), and dragged the formula to the bottom.

Somehow, Thu 02/29 wasn't being viewed as a date. This was easily fixed when I added the year back in. Had I not had a Leap Day date, I probably wouldn't have noticed! To check my work, I added a date range that included February 29.

With this check, I noticed another mistake: isn't February 28 to March 1 three days in 2024? By subtracting one date from another, I was doing an exclusive date range, rather than an inclusive range. In other words, if you're traveling the two days of March 10th and 11th, the program would calculate that you're only gone one day: 11 - 10 = 1. There is probably a simple formula to calculate an inclusive date range, but for my purposes it was easy enough to subtract dates and add one to the result.

The final thing I'll point out, which you may have noticed, is that this data manipulation is able to handle dates in a variety of formats. As long as the data from the SPLIT() function can be recognized as a date, it's perfectly usable by Sheets! So I was able to use data from several different sources, regardless of format, and plug it in.

I love puzzling out tips and tricks! Ultimately it may have been faster if I'd just manually edited each field, but I wouldn't have learned new skills that way.