
Introduction and Postmortem: SQL Power Users


Jason Cohen said something interesting a few weeks ago on Twitter. Jason’s writing is very good. And I think this is largely the correct thing to do. The one tweak I’d suggest, and maybe he already does this, is to label the writing that he now disagrees with.

I think there’s value in sharing how your thinking has changed over time. So I’m going to test that out here. Below is the announcement post I published last year about a new side project. I’m going to do a quick postmortem here, do some small edits on the original (as I would on any old work I revisit regardless of outcome), and then keep most of the original in place.

The idea

Late 2022, I was looking for a side project. I settled on a documentation site for people who use SQL (in short, a language you use to query databases). Here are a few of the things I was looking for in a project:

  1. Something I could work on outside of work hours that my company wouldn’t think was a distraction.
  2. Something that would be simple and complete if I decided to move on after a few months.
  3. Something that could later monetize if I enjoyed working on it.

I figured there was a good chance this would be a toy project, just to get some reps building something quickly. This idea seemed fine because I could do all the work myself, I had connections in the space, and it could be an asset in future job searches.

Pre-launch

I conducted a few “Mom Test” interviews, where you try to listen to folks’ problems instead of leading with your idea. At the end of a few calls, I shared one example page. The feedback wasn’t great, and I didn’t consider the idea validated. All the interviews were done outside of work hours, which impeded scheduling with a few other people in my network.

Rather than do more interviews, I decided to just build out a small site quickly. I’d start with ~40 SQL functions (tailored to the Snowflake data warehouse) and two blog posts. I don’t know if I learned the right lesson here. Building the thing and putting it out there seems fine when it doesn’t take much time or money (the only money I spent was a few bucks on the domain). Whether I should have done that after the inconclusive interview feedback, I’m still not sure.

I aimed to get the site live in 10 days. It took about 10 days of work, but the problem was that those 10 days were spread across three bursts: the end of December 2022, mid-January 2023, and the beginning of March 2023. My excitement had faded in December. I wasn’t motivated to work on it, and then procrastinated on other stuff because I felt guilty about ignoring this side project.

There were two other things worth mentioning in this section. The first was that I was curious if Snowflake would pay me anything to send people to their site. I contacted a few members of the team, but that didn’t lead anywhere. That was probably getting ahead of myself. The second was the release of ChatGPT between when I had the idea and actually released it. Even though I didn’t use ChatGPT for SQL help, I saw the writing on the wall that a big segment of potential users would get a better experience from an LLM. And for something that hoped to grow in part from SEO, I suspected that many search queries would go to ChatGPT before Google.

Post-launch

In March 2023, I shared the site in a popular data Slack community. Sharing my work, guest posting on other blogs, speaking at conferences, and interviewing data people for the site were all ways I was planning to grow (and collect backlinks for SEO).

My initial plan from there was to keep adding docs, blog posts, and tutorials to the site, and get traffic in a few ways: posting in data Slack communities, guest posting on other blogs, speaking at conferences, and doing interviews with data people with strong networks.

There were a few things I learned from the early data. First, my hypothesis that it would be really easy to rank on Google for Snowflake function docs was correct. I had several pages rank on the first page of Google pretty quickly. Second, site visits were really short and I had a really high bounce rate. This type of content probably inherently has a pretty high bounce rate, but I was still concerned that this meant that visitors were not finding the content valuable.

Getting my name out via guest posts and conferences had mixed results. I wrote once for Locally Optimistic, which was my favorite data blog. The editorial team there was great, and the piece came out a lot better than if I had just worked on it alone. It took a few months to get published, largely because I could have moved faster. I believe I was welcome to write again, though there hasn’t been anything data-related I’ve intrinsically wanted to write about. Conferences were tougher. My pitch for dbt Coalesce 2024 was rejected, and some other data conferences were not as friendly to less-established speakers.

A surprise was that while most people I reached out to for informational interviews in December said yes, none of my interviewees wanted to have their career story featured on the website. I set that aside as a potential growth lever for later on, when it may be more appealing.

As for adding new content, I also realized that I was working on what I was best at instead of what would actually yield the best results. But even then I wasn’t enjoying it, so I admitted to myself I wanted to move on.

Final Thoughts

JA note: I’ll give this one more pass in a few weeks for more conclusions.

Alright, that’s it for the post-mortem. Here’s what I wrote back then.

In November, I mentioned briefly the thought of starting a dedicated data blog separate from this site. Several weeks later, I decided to give that thought additional consideration and am excited to share SQLPowerUsers.com with you. It includes a blog component, but there’s more.

First, some background.

SQL has evolved, but not the content

I use SQL, the language used to query relational databases, every day on the job. It’s been around since the 1970s, and because of that, there’s plenty already written about it on the internet.

However, a lot of that content is old, outdated, and not very useful to people working in data science and analytics today. Technologies evolve over time – and new, more powerful database technologies such as Snowflake, Google BigQuery, and Amazon Redshift look very different in some facets. And there is demand for relevant and helpful information.

I observe three main types of content that are being produced now to meet this demand.

Content marketing

Data startups produce a lot of content marketing and technical blogs (as they should!). Overall, the quality is pretty good; often they’re written by qualified freelancers. Data folks kept telling me, though, that they’d read these posts, reach the ad for the company’s product at the bottom, and then wonder if they were just being steered towards buying something.

One-off blog posts

There are a lot of one-off posts on sites like Medium (sometimes personal websites), of varying quality. In general, they’re overly complicated, because some people are trying to look impressive while job hunting.

Newsletters

There are a lot of Substacks and newsletters out there. I already read a few regularly, but for research purposes, I tried 10 others. To my surprise, they were all pretty good. It’s awesome for the community that a lot of data people turn out to be awesome writers as well. A few small nits to pick from what I saw:

  1. It’s hard to keep up the habit of writing consistently over a long period of time. There’s already a graveyard of half-finished archives, which I suspect will worsen with time as writers move on to the next project.
  2. Newsletter posts often jump from topic to topic, whatever is top of mind, so the archive ends up more of a bottom-up blend than one top-down, cohesive piece of work. My instinct tells me the bottom-up approach is less evergreen.
  3. There’s a lot of thought leadership in data Substacks. Most of it is good. Sometimes it feels a bit detached from the day-to-day problems we face on data teams, but I’m glad people way smarter than me put out the work they do.

That was a lot of words. Here are the takeaways I’m acting on:

  1. Content marketing often picks the right topics, but readers will still take it with a grain of salt if they think they’re just being sold something. Not selling something helps build trust.
  2. Some posts on venues like Medium gear up the complexity to make the author look good at the expense of being actionable to the reader. Not all posts need to be actionable, but some should.
  3. Most newsletter writers will probably move on to other projects, leaving an incomplete and maybe non-cohesive, bottom-up archive. Having a top-down, evergreen content plan may be a differentiator.

Content Mix

Now back to SQL Power Users. The vibe I’m going for with content is: authentic, actionable, and cohesive. Basically the three takeaways from the last section. Three types of content stood out as fits.

Pareto SQL Docs

Bear with me, as I’m probably going to iterate on that name. All data people Google SQL questions. If you’re working with one of the newer data warehouses I mentioned before (such as Snowflake, BigQuery, or Redshift), most of the search results won’t be relevant.

For Snowflake, for example, the official Snowflake documentation is pretty much the only game in town for function help. It has a ton of valuable info. But nobody I spoke with (myself included) has a pleasant experience with the official Snowflake docs. They can be dense, and the examples aren’t always a great fit for the use cases we want. That’s not really a slight on Snowflake – they have to write for a much broader audience in the official documentation.

So I wrote my own function documentation for Snowflake. With much less text than the official docs, these convey 80% of the info (hence the name Pareto). I tried to be clearer with word choice, and the examples are the use cases that I actually use the functions for.

Will people use these? I will, so at worst this is just going to improve my speed at work. Will others? Remains to be seen. The top few pages of Google search results don’t show others doing this, so there is some opportunity to rank for long-tail keyword combinations. I don’t expect to outrank Snowflake (which has pretty good SEO), but I know that when I look for SQL help, I usually end up visiting multiple search results (sometimes Snowflake and something else).

Authentic: I use Snowflake every day and am a big fan of the product. These will improve my experience using it.

Actionable: Hopefully, these docs are good enough that readers can directly apply the info into their SQL queries.

Cohesive: Instead of writing a one-off blog post about how to use a random SQL function, I documented about 50.

Guides and tutorials

As I said earlier, I don’t want to write up articles that are overly technical just for the purpose of looking impressive, and I’m happy to leave thought leadership to much smarter people than myself. That leaves plenty of room on the continuum for some practical posts on subjects I have relevant knowledge and experience in, starting with Your First 90 Days as a Startup’s First Data Hire. I haven’t decided what the balance will be in terms of technical vs. non-technical, but I expect a mix of both.

Authentic: I plan to write only on topics that I am (moderately) qualified to speak on — mostly meaning I have done it myself before.

Actionable: Topics are prioritized to be actionable.

Cohesive: An advantage of being the sole contributor is it is much easier to not contradict myself across posts.

Interviews

One of my favorite parts of writing for my college newspaper was telling people’s stories. There are definitely some podcasts in the data space that do this, but I’d prefer to focus on the written medium. I’ve got a few super interesting folks lined up, and I’m excited to use this opportunity to get to know some other data people better, too.

Most early-career people in data roles want to pivot in some way — whether to another data role (often data scientist or data engineer) or a non-data role (like product manager). Hopefully there will be some content here that helps with that decision.

Authentic: I will have a high bar and be very purposeful about who gets featured here.

Actionable: I think these are going to be the least actionable pieces of content on the site, but I believe they have the potential to be very valuable in the less frequent cases they are very relevant to a reader.

Cohesive: Interviews add back the human element to the previous two types of content. They also make it easier to address topics like careers that may be interesting to the reader.

Wait, what’s in it for you again?

Well, the site is free, so I don’t expect to make much money off this. Here are a few ways I think this project will be a productive use of time.

  1. I’ll get some more reps writing, including some technical writing.
  2. I can demonstrate competence in an area where I’ve been spending a lot of time the last few years. Plus, I can share some learnings and thoughts that would’ve helped me a few years ago. It’s like the portfolio I never made.
  3. I can learn more about SEO, and make it a contest with myself to rank as high as I can for some searches.
  4. Through the interviews, I may meet some really smart people. It also opens up a point of discoverability if someone reads and decides to get in touch.

Where do you go from here?

I think there are some good foundations in place but the content library still has some clear opportunity areas. Here’s to writing some better SQL content these next few months.