Old Man Yells at Cloud
You feel that you are watched when you are private. Even when you are not private you cannot choose your audience.
-- This Clouded Heart by Steven Jesse Bernstein
Background
When the lines above were written, in a pre-internet age, they sounded paranoid (perhaps deliberately so). Now, for many people, they are literally true. This is mostly because we've accepted a truly astonishing amount of tracking and surveillance in the software and hardware that we use every day. Part of what enables that tracking and surveillance is a change in direction at most software companies (that is to say, moving everything to the cloud).
Once upon a time, computers were enormous and filled rooms, so if people wanted to interact with them they did so through "dumb terminals": teletypes or monitors and keyboards that were remotely connected to the computer and had no local storage. All software ran on the mainframe, and everybody who needed to use the computer had to share time and terminals to use it.
Later, as computing hardware shrank, the terminals got "smarter" and became thin clients that could store some info locally (or at least read from local disks), and eventually we got computers that fit on a desktop, and finally in our laps. The trend then was to write software you could run on your home computer because networks were absent or very, very slow.
Over the last 20 years, as network speeds have accelerated, we've seen the opposite trend: software is migrating back to centralized computers ("the cloud" is just someone else's computer, after all). This has pluses and minuses for end users. Some positives are that you can access the software (theoretically) from anywhere, and for some applications (like speech recognition) you can benefit from more powerful hardware and more complete datasets than you could ever have at home.
The downsides, however, are manifold. Pushing software to the cloud has allowed changes to software business models that are basically anti-consumer. Lots of software that doesn't benefit from being internet-connected in any way (like word processors or PDF readers) has adopted a cloud-centered model to force customers into a subscription sales model, which boosts profits but provides very limited benefits to users (it's literally rent-seeking for mature software suites). Some software collects data about you (sometimes without your knowledge or informed consent), monetizes it, and sells it, which you cannot prevent because the software will not work when disconnected from the network. Additionally, many of these companies have lax security practices, collect more data than they (or you) realize they're collecting, or allow access to that data in ways they don't realize. One recent high-profile example: the major digital assistants (Siri, Alexa, et al.) were using live humans for voice analysis but didn't tell anyone, so random contractors were listening to recordings of people having sex, having private arguments, etc.
Another unfortunate feature of the cloud-first model: if a company running cloud-based software goes out of business, stops doing business in your country, or decides it doesn't want to provide the service anymore (as Google does with various services a few times a year), you're out of luck and can't use the software anymore at all, even if it was working fine and could easily be run locally. In short, it's increasingly challenging to know or control what the software you run is doing, and if you like certain software, there's no guarantee you can keep using it.
The good news is that, for at least some categories of software, it's still possible to run them at home, often with convenience and capability comparable to cloud-connected products. Free and open-source software (FOSS) in particular can be a good choice because it's not typically developed for profit, and tends to be more privacy-respecting because users can see what the code is doing. Open-source software is also typically available at no cost (although developers appreciate donations). I think it's important to "vote with one's feet" and try to use software that respects you as a user to slow the slide into a future where we have no control over our privacy at all. Sometimes there's no way out of cloud-connected products (I certainly still use some), but "the perfect is the enemy of the good": one can try to maximize privacy while accepting that some data-sharing is inevitable.
Privacy-maximizing software selection dichotomous key with examples:
So here's the process I go through when I'm looking for new software:
1) Can you fruitfully do it on your own computer (either as a user program or a self-hosted service) rather than "in the cloud"?
   - If YES, then look for an open-source program to do it (write documents in LibreOffice on your computer rather than in Google Drive; host a Nextcloud instance rather than using Dropbox if you're technically inclined). If the open-source programs can't do what you need, look for a local program you can pay for (if you need a way to host and play back media, and FOSS like Kodi won't work, there are locally hosted media servers you can pay for, like JRiver or Emby).
   - If NO, go to 2).
2) Cloud-connected products that offer end-to-end encryption are also preferable, because the cloud-service provider knows nothing about your data; they just see encrypted blobs of gibberish that only get decoded when you look at them. Can you find a commercial service that meets your needs that's end-to-end encrypted?
   - If YES, use that. (Signal is an end-to-end encrypted messaging app that can stand in for the default text messaging app on your phone when your contacts use it too.)
   - If NO, go to 3).
3) Software companies that derive most of their revenue from user payments have less incentive to try to collect and monetize sensitive data about you. Can you find a commercial service that meets your needs that derives most of its revenue from user payments?
   - If YES, use that. (Pay for email through services like Fastmail or Proton Mail rather than using Gmail; Slack makes its money from its customers, while Google's chat/collaboration functions are heavily subsidized by advertising.)
   - If NO, go to 4).
4) If the only service that will meet your needs is a commercial service that derives most of its income from data collection and/or advertising, can you do without the service?
   - If YES, give the service up.
   - If NO, live with the privacy loss, but recognize that nothing you do on the platform is remotely private (i.e., company staff or even third-party employees may wind up reading your emails, DMs, etc.). The platform may also try to track your internet use even when you're not using the platform ("social" like buttons, pixel tracking, etc.). Mitigation tactics include deliberately providing false personal info, using a browser to log in instead of an app (even on your phone), logging out whenever you're not actively using the service, using ad and/or script blockers in your browsers, and clearing your browser cache and cookies when you close your browser. These steps make life less convenient, and won't solve the big problems, but they will reduce how effectively sites that aggressively try to track you across the web, like Facebook, can follow you when you're not actively using them.
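
For anyone who thinks more easily in code, the key above can be written as a short Python function. This is only a rough sketch: the `Need` fields, the `Choice` names, and the `choose` function are all made up for illustration, and the fall-through from step 1 to step 2 when no local program fits is my own assumption (the key doesn't say what to do in that case).

```python
from dataclasses import dataclass
from enum import Enum


class Choice(Enum):
    """Possible outcomes of the key (names are illustrative only)."""
    LOCAL_OPEN_SOURCE = "use a local open-source program"
    LOCAL_PAID = "use a local program you pay for"
    END_TO_END_ENCRYPTED = "use an end-to-end encrypted service"
    USER_FUNDED = "use a service funded mainly by user payments"
    GO_WITHOUT = "give the service up"
    ACCEPT_AND_MITIGATE = "live with the privacy loss and mitigate"


@dataclass
class Need:
    """Your own answers to the questions in the key (hypothetical fields)."""
    can_run_locally: bool = False
    open_source_option: bool = False
    paid_local_option: bool = False
    e2e_encrypted_option: bool = False
    user_funded_option: bool = False
    can_do_without: bool = False


def choose(need: Need) -> Choice:
    # Step 1: prefer software that runs on your own computer.
    if need.can_run_locally:
        if need.open_source_option:
            return Choice.LOCAL_OPEN_SOURCE
        if need.paid_local_option:
            return Choice.LOCAL_PAID
        # Assumption: if no local program fits, carry on to step 2.
    # Step 2: the provider only ever sees encrypted blobs.
    if need.e2e_encrypted_option:
        return Choice.END_TO_END_ENCRYPTED
    # Step 3: revenue comes from users rather than from their data.
    if need.user_funded_option:
        return Choice.USER_FUNDED
    # Step 4: only ad-funded or data-funded services are left.
    if need.can_do_without:
        return Choice.GO_WITHOUT
    return Choice.ACCEPT_AND_MITIGATE


# Example: a need with no local option but an encrypted service available.
print(choose(Need(e2e_encrypted_option=True)).value)
```

The order of the checks is the whole point: each later step is only reached when every more private option has been ruled out.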
Disclaimer
Any brands mentioned above are used as illustrations, not necessarily recommendations. I have no commercial (or other) relationship with any of the brands mentioned, and have received no consideration of any kind. I'm just a user of the software or services.