PgDog is funded and coming to a database near you
Posted by levkk 6 days ago
Comments
Comment by eikenberry 6 days ago
I've used Postgres at a few places and the #1 problem was always high availability, not scaling. One Postgres cluster could easily handle 100000 transactions per minute, but when a primary node went down it was a page and manually failing over to the spare then manually replacing the spare. The manual tooling was very finicky but at least it worked, no automated solution came even close. Lack of a good HA story is why I avoid self-managed Postgres as much as possible.
Comment by levkk 6 days ago
Load balancer with health checks and failover, works out of the box. :) Battle-tested at this point too, so could be worth a look.
Comment by r7n 6 days ago
Comment by bofaGuy 6 days ago
Comment by ngc248 5 days ago
Comment by moomoo11 6 days ago
and yeah you have to spend a lot of upfront time designing your data models
Comment by zamalek 6 days ago
Comment by r7n 6 days ago
Take for example AuroraDB: the sheer engineering it took to make SQL do scalable OLTP at all tells you how much that flexibility actually costs to keep.
Comment by ah27182 5 days ago
Comment by jsw 6 days ago
Comment by r7n 6 days ago
So much so, we rewrote the Dynamo SDK to squeeze out more optimizations so we could be the same cost even though we were a layer in front of Dynamo. We used key encoding and various other techniques, as well as managed capacity (on demand vs reserved), to transparently optimize workloads for price. In our experience we saw dramatic gains vs just vanilla SDK usage.
If you're curious, here was the marketing website, but we're now part of Databricks: https://stately.cloud/
Comment by jsw 6 days ago
Comment by cherioo 6 days ago
I don’t think the backend matters. It’s the frontend wrapper that makes or breaks HA.
Comment by inigyou 6 days ago
Comment by doctorpangloss 6 days ago
Comment by gchamonlive 6 days ago
Comment by doctorpangloss 6 days ago
> PgDog does not detect primary failure and will not call pg_promote(). It is expected that the databases are managed externally by another tool, like Patroni or AWS RDS, which handle replica promotion.
Comment by nikolatt 6 days ago
Comment by znpy 6 days ago
Comment by doctorpangloss 6 days ago
Comment by dev-ns8 6 days ago
Comment by dotancohen 6 days ago
Comment by inigyou 6 days ago
Comment by eikenberry 6 days ago
Comment by MeetingsBrowser 6 days ago
Comment by eikenberry 6 days ago
Comment by parthdesai 6 days ago
Comment by nijave 6 days ago
Comment by eikenberry 5 days ago
Comment by globular-toast 6 days ago
Comment by nijave 6 days ago
Comment by globular-toast 6 days ago
Comment by nijave 5 days ago
It's a little easier to strip down userland if the machine is only running PG. Technically possible on k8s with distros like Talos, Bottlerocket, etc but you still have all the k8s deps on top of PG. It's also a little easier to do defense-in-depth on a dedicated PG machine which means you might have mitigating controls in place to skip security patches (minimal kernel modules, selinux)--possible on k8s but now you're fighting through a 2nd layer of configuration
RDS is a bit of a special case because you also have AWS curating and prioritizing updates. You can do that yourself but it's a bit of a time sink scrutinizing every upgrade to see if you _really_ need it. Our RDS instances tend to go 3+ months without restarts
Comment by tempest_ 6 days ago
Comment by pinkgolem 6 days ago
Comment by VirusNewbie 6 days ago
Comment by inigyou 6 days ago
Comment by ahachete 5 days ago
I'm really happy that there's more options for Postgres sharding and I applaud Pgdog and the team's efforts and energy.
Having said that, this makes it a no-go for me:
> shard_number = hash(data) % num_shards
https://docs.pgdog.dev/features/sharding/basics/#terminology
Most sharding solutions distribute the hash value over linear ranges, which are then split across "virtual shards", which are in turn placed on the physical shards or workers. This allows for shard replacement when needed. For example, Citus works this way, and even adds convenience functions for shard migration (using logical migration) in an automated way. That's all I'd need.
Operationally, it's worlds apart. With modulo distribution the only way to replace data is to reshard everything --something you don't want to do however fast the operation may be.
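To make the operational difference concrete, here is a minimal sketch comparing how many keys change shards when going from 4 to 5 physical shards under plain modulo placement versus a fixed set of virtual shards. Everything in it is an assumption for illustration (the SHA-256-based hash, the 1024 virtual shards, the function names); it is not how PgDog or Citus actually implement placement.

```python
import hashlib

NUM_VIRTUAL = 1024  # number of virtual shards; chosen once and never changed


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash (unlike built-in hash(), stable across runs)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


def modulo_shard(key: str, num_shards: int) -> int:
    """Direct modulo placement: shard_number = hash(data) % num_shards."""
    return stable_hash(key) % num_shards


def virtual_shard(key: str) -> int:
    """A key's virtual shard never changes, whatever the physical layout is."""
    return stable_hash(key) % NUM_VIRTUAL


def initial_placement(num_physical: int) -> dict[int, int]:
    """Map each virtual shard to a physical shard using contiguous ranges."""
    return {v: v * num_physical // NUM_VIRTUAL for v in range(NUM_VIRTUAL)}


def add_physical_shard(placement: dict[int, int], new_id: int) -> dict[int, int]:
    """Hand the new shard only the virtual shards each old shard holds above its new share."""
    nodes = sorted(set(placement.values()))
    target = NUM_VIRTUAL // (len(nodes) + 1)
    new_placement = dict(placement)
    for node in nodes:
        owned = [v for v, p in placement.items() if p == node]
        for v in owned[target:]:
            new_placement[v] = new_id
    return new_placement


keys = [f"user-{i}" for i in range(10_000)]

# Plain modulo: a key moves whenever hash % 4 differs from hash % 5.
moved_modulo = sum(modulo_shard(k, 4) != modulo_shard(k, 5) for k in keys) / len(keys)

# Virtual shards: a key moves only if its virtual shard was reassigned.
old = initial_placement(4)
new = add_physical_shard(old, new_id=4)
moved_virtual = sum(old[virtual_shard(k)] != new[virtual_shard(k)] for k in keys) / len(keys)

print(f"modulo 4 -> 5 shards: {moved_modulo:.0%} of keys move")
print(f"virtual shards 4 -> 5: {moved_virtual:.0%} of keys move")
```

With a uniform hash, roughly four fifths of the keys move in the modulo case, while the virtual-shard layout moves only about the one fifth that the new shard has to own anyway.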
Comment by levkk 5 days ago
Adding a sharding function in our architecture is relatively straightforward. We also support plugins which can control the flow (and direction) for queries, so our users can add their own (and they do!).
Comment by ahachete 5 days ago
* Adding a sharding function, as you say.
* Developing an external service for metadata (shard placement) or alternatively have that metadata in one place and replicate (consistently!) to every query router.
* Implementing functions/catalogs for the users to understand the placement and configure/alter it.
* Implementing shard migration / rebalancing capabilities, possibly using Postgres logical replication (plus notable automation).
Here's one idea if you follow this path, something that Citus doesn't have: make the sharding function pluggable and pick one by default which is well-known and available in many languages (e.g. xxhash). If you do so, and guarantee stability of those functions, they could be used externally (by applications) to route queries and especially inserts to the appropriate shard. While it makes the application more complex, it may allow (combined with access to the metadata service) for faster ingestion paths (this is often known as application-assisted sharding), and it's not exclusive of the query routers.
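As a rough illustration of application-assisted sharding, the sketch below has the application compute the same stable hash the routers would use and pre-split an ingest batch by physical shard. All names here are hypothetical: `shard_hash` uses SHA-256 from the standard library only as a stand-in for a well-known hash such as xxhash, and the `placement` mapping stands for whatever a metadata service might return.

```python
import hashlib
from collections import defaultdict

NUM_VIRTUAL_SHARDS = 64  # hypothetical; must match what the query routers use


def shard_hash(key: str) -> int:
    """Stand-in for a stable, widely available hash such as xxhash."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


def group_rows_by_shard(rows, key_field, placement):
    """Split rows into one batch per physical shard.

    placement maps a virtual shard number to a shard name or DSN, as it
    might be fetched from a (hypothetical) metadata service.
    """
    batches = defaultdict(list)
    for row in rows:
        vshard = shard_hash(str(row[key_field])) % NUM_VIRTUAL_SHARDS
        batches[placement[vshard]].append(row)
    return dict(batches)


# Hypothetical placement: 64 virtual shards spread over two physical shards.
placement = {v: f"shard-{v % 2}" for v in range(NUM_VIRTUAL_SHARDS)}
rows = [{"tenant_id": t, "clicks": t * 3} for t in range(100)]
batches = group_rows_by_shard(rows, "tenant_id", placement)

for shard, batch in sorted(batches.items()):
    print(shard, len(batch), "rows")
```

Each batch could then be bulk-loaded directly into its shard, which only stays correct as long as the hash function and the placement metadata are guaranteed stable, as the comment points out.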
Edit: formatting
Comment by codegeek 6 days ago
Couldn't be a better why us :)
Comment by aurareturn 6 days ago
Comment by tomtomtom777 6 days ago
Instacart doesn't need "100,000s of grocery delivery orders per minute".
There must be some 0s added for the sake of the story.
Comment by true_religion 6 days ago
It might make 100k row level changes per minute, but that’s a different metric.
https://www.sec.gov/Archives/edgar/data/1579091/000157909126...
Comment by FinnKuhn 6 days ago
I assume they are referring to how many database requests they have due to customers orders or a similar metric and just worded it poorly.
[1] https://www.kaggle.com/datasets/psparks/instacart-market-bas... [2] https://rstudio-pubs-static.s3.amazonaws.com/284199_5c498037...
Comment by aeyes 6 days ago
Comment by FinnKuhn 6 days ago
So I still assume the original comment isn't referring to actual orders placed.
[1] https://www.kaggle.com/datasets/psparks/instacart-market-bas... [2] https://fortune.com/2022/05/18/what-to-know-instacart-ipo/
Comment by andriy_koval 6 days ago
Comment by tomtomtom777 5 days ago
Comment by ktm5j 6 days ago
Comment by gaucheph 6 days ago
Comment by willio58 6 days ago
Comment by UqWBcuFx6NV4r 6 days ago
Comment by true_religion 4 days ago
I can't say what the curve looks like, but 100,000 orders per second would reach the official quarterly count in 15 minutes.
Since that's unlikely, this at least gives us some degree of bounds to guess what the curve looks like.
Comment by dotancohen 6 days ago
Comment by aurareturn 6 days ago
Comment by dotancohen 6 days ago
Comment by aurareturn 6 days ago
No clue how a shopping cart or checkout flow would drastically increase database load. It should just be basic CRUD. Building a shopping cart is something every student makes. Pages in a web store can be cached relatively easily since items won't change often.
A primary DB with a few replicas and caching can go a really long way.
Comment by chatmasta 6 days ago
There are challenges scaling read-heavy workloads, for sure, but they're generally more straightforward than scaling write-heavy workloads. You can get away with more dumb horizontal scaling than with writes.
Comment by willdr 6 days ago
Comment by aurareturn 5 days ago
I think one piece that someone else mentioned could require DB sharding and that is all the live data needed for tracking deliveries.
The actual website/app should not need more than one beefy Postgres instance.
Comment by inigyou 6 days ago
Comment by nijave 6 days ago
Could just be looking at the "orders" endpoint in their app which might also include incremental updates as shoppers get items from the store. It's a fairly ambiguous statement
Comment by smt88 6 days ago
Comment by aurareturn 6 days ago
Comment by smt88 6 days ago
Comment by aurareturn 6 days ago
Everything else seems normal DB CRUD that a single beefy instance with a few replicas should handle easily. Type ahead search is no doubt using a different service and not directly querying Postgres.
Comment by outworlder 6 days ago
Comment by nine_k 6 days ago
Comment by qaq 6 days ago
Comment by inigyou 6 days ago
Comment by azinman2 6 days ago
Comment by paoliniluis 6 days ago
Comment by chrisvenum 6 days ago
Right now I have a project that has very heavy write traffic from multiple services and a web app that reads from this. We are starting to hit the point where no amount of indexing, query optimisation, caching or box upgrades is helping us. We are looking at maybe moving the bulk of the static data to clickhouse to reduce the DB size but I would love to hear if PgDog or other kind of sharding could be useful for this use case.
Comment by levkk 6 days ago
That's exactly right. Get in touch (lev@pgdog.dev), happy to help or at the very least tell you what currently works (or doesn't) so you know what your options are.
Comment by inigyou 6 days ago
Because it's not magic, you do still have to know what's going on under the hood, e.g. no cross-shard transactions.
I'd see if my application can benefit from read replicas before doing sharding, because sharding is difficult (if you care about data consistency). With replicas, each replica does have a full copy of the data and you only write to the master - you have to decide which transactions are suitable for running against replicas, which can lag slightly behind realtime. E.g. reading data to build a webpage is probably safe to do from a replica - any read-modify-write is not.
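The replica-versus-primary decision described above can be sketched as a tiny router. This is only an illustration under stated assumptions: the class, its parameters, and the idea of passing `read_only` and `needs_fresh_data` flags explicitly are all hypothetical, and a real setup would typically make this choice in a proxy or in the database driver.

```python
import itertools


class ReplicaAwareRouter:
    """Pick a database target for a transaction (hypothetical helper)."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Round-robin over replicas; None means there are no replicas.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def target_for(self, *, read_only: bool, needs_fresh_data: bool = False):
        """Return where a transaction should run.

        Only pure reads that tolerate replication lag go to a replica.
        Writes, read-modify-write cycles, and reads that must see the
        latest committed data go to the primary.
        """
        if read_only and not needs_fresh_data and self._replicas is not None:
            return next(self._replicas)
        return self.primary
```

Rendering a web page would call `target_for(read_only=True)`, while something like "read the balance, then update it" would call `target_for(read_only=False)` so that both steps see the primary's state.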
Comment by levkk 6 days ago
Comment by inigyou 6 days ago
They also reduce the benefit of sharding, possibly down to worse performance than a non-sharded DB.
Comment by levkk 5 days ago
Comment by yabones 6 days ago
Comment by tempest_ 6 days ago
This is for DBs that are ~1-1.5TB but don't have a huge amount of churn/qps
Effectively what is described here https://www.pgedge.com/blog/always-online-or-bust-zero-downt...
Comment by tux3 6 days ago
If you use something like CloudNativePG they automate parts of the process with cli tools and declarative syntax. Otherwise you take the time to figure it out by hand. It might sound complicated, but just practice on your staging DB, and if all goes well you do the same procedure in prod.
Edit: Apparently Postgres 19 has a patch for one-shot logical replication of sequences! https://www.depesz.com/2025/11/11/waiting-for-postgresql-19-...
Comment by paulryanrogers 6 days ago
Comment by boxed 6 days ago
Comment by jeltz 6 days ago
For both MySQL and PostgreSQL you will need to use some kind of logical upgrades if you want no downtime.
Comment by ComputerGuru 5 days ago
(For example, ports under FreeBSD doesn’t let you install multiple Postgres versions as they are marked as conflicting packages so installing one necessarily uninstalls the other. The saving grace here is that most (virtually all) FreeBSD installations have root on ZFS and you can employ ZFS snapshots (via the hidden .zfs folder) to access the old binaries after upgrading to the new postgres version, but not many people know this trick!)
Comment by tomnipotent 6 days ago
Comment by jeltz 6 days ago
Comment by boxed 6 days ago
Comment by jeltz 6 days ago
Comment by evanelias 6 days ago
Comment by Blackthorn 6 days ago
Comment by jeltz 6 days ago
Comment by znpy 6 days ago
At this point i wonder if i'll ever see that.
Comment by hasyimibhar 6 days ago
For example, with Multigres, you should be able to achieve true zero downtime major version upgrade by simply resharding [2]. With vanilla Postgres + pgBouncer, you can only achieve near-zero downtime (few seconds at most), though it's probably good enough for most use cases.
[2] https://multigres.com/docs#migrate-across-postgres-versions
Comment by znpy 4 days ago
According to their GitHub (https://github.com/multigres/multigres) as of today (June 12th, 2026):
> Multigres is a Vitess adaptation for Postgres. The project is currently in the early stages of development.
Maybe it works, maybe it doesn't. I would start looking into it when it gets released as stable. Otherwise it's unfair.
Comment by pgedge_postgres 4 days ago
Comment by jjice 6 days ago
Comment by hylaride 6 days ago
Multi-master is hard. The main issue is what to do with commit/replication lag. It's far "easier" if support for eventual consistency is ok with your use case. In some cases it's not. Also, the problems related to read-only lag can happen on multi-master instances. If somebody does a giant long running query on one of the masters, the target instance needs to hold the data state for the query, even if the underlying DB is getting updates. It also needs to still keep up with other masters. This means the whole cluster can slow down if the multi-master replication is synchronous. Depending on a variety of factors, that can chew up disk space, memory, etc.
There are ways of dealing with these issues (and others), but it comes with tradeoffs with performance, etc.
Comment by aynyc 6 days ago
Comment by evanelias 6 days ago
[1] https://mariadb.com/resources/blog/upgrade-now-announcing-my...
Comment by timacles 6 days ago
Multi master, from even a conceptual perspective, is incredibly complicated. Databases, transactions, consistency, parallelism are all very complicated.
It’s something that always seems promising at the start but as soon as maintenance and long term improvements enter the picture(ie integrating new Postgres versions), the complexity becomes too much.
Comment by tschellenbach 6 days ago
Comment by briffle 6 days ago
Comment by tschellenbach 6 days ago
Comment by welder 6 days ago
Comment by paulryanrogers 6 days ago
Comment by Ozzie_osman 6 days ago
We sharded over 20 TB that we know about.
This is probably a typo, right? 20TB isn't that big. I would imagine they've sharded a lot more than that
Comment by dujuku 6 days ago
Comment by inigyou 6 days ago
Comment by ComputerGuru 5 days ago
Comment by Ozzie_osman 6 days ago
Comment by ubercore 6 days ago
Comment by GiorgioG 6 days ago
Comment by mplanchard 6 days ago
Comment by returningfory2 6 days ago
Comment by jeltz 6 days ago
Comment by tingletech 6 days ago
Comment by happyopossum 6 days ago
Comment by singron 6 days ago
Comment by rbranson 6 days ago
Comment by aejm 6 days ago
Comment by levkk 6 days ago
1. Control plane to manage multi-node deployments; "works out of the box" experience to make PgDog easy to deploy and use
2. QoS (quality of service): automatically block bad queries from taking down the database
Last but not least, you get SLA-backed support from us (up to P0).
New features are broken down into two categories:
1. Sharding / running Postgres at scale: always open source.
2. Infra management / making it easy to run PgDog at scale: enterprise.
Comment by underdeserver 6 days ago
Comment by aejm 6 days ago
Comment by moralestapia 6 days ago
Wrt. the pooler, how do you compare with pgbouncer?
I'm interested because I have a postgres instance, low-traffic but still like ... tens of r(eads)ps. I was not running anything close to the machine limits but still added pgbouncer to improve performance and didn't see a noticeable difference. I was stress-testing the machine obv., I'm not talking about the 10 rps, lol.
For context, my numbers were something like 10k rps +/- 1k vanilla postgres and like 9k rps +/- 1k with pgbouncer in front of it. So ... slightly slower but big error bars so I wouldn't say for sure. I ended up not using pgbouncer as the benefit was immaterial.
Also yeah, in case you want to check it out, it's the db that backs this project: https://httpstate.com.
Comment by levkk 6 days ago
Comment by directionless 5 days ago
Comment by levkk 5 days ago
Comment by karolist 6 days ago
Comment by levkk 6 days ago
Comment by ParadisoShlee 6 days ago
Comment by kjuulh 6 days ago
Also had an issue with it because it seems to cache authentication requests when doing passthrough: I'd changed the role's password, but it kept using the old one, which was no bueno ;).
PgDog seems to make more sense when you really care about a few databases that need massive scale, rather than a simple proxy in front of postgres. I'll keep following the development though, it is much needed in this space, postgres can use all the investment it can get to get it past the single machine scale that it excels at currently.
Comment by levkk 6 days ago
We'll get there.
Comment by maherbeg 6 days ago
You could also build a watcher side car that watches for changes of the pgdog_users.toml and have pgdog refresh itself then too with this combination. We thought about that but prefer to control the reloads for our needs.
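A watcher sidecar along these lines could look roughly like the sketch below. It polls the file's content hash and calls a reload hook when it changes. The `on_change` callback is deliberately left abstract: how the reload is actually triggered in PgDog is not shown here, and the class and method names are hypothetical.

```python
import hashlib
import time
from pathlib import Path


def file_digest(path: Path) -> str:
    """Hash the file contents so touch-only changes do not trigger a reload."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


class ConfigWatcher:
    """Poll a config file and call on_change() when its contents change."""

    def __init__(self, path, on_change):
        self.path = Path(path)
        self.on_change = on_change  # hypothetical hook that asks the proxy to reload
        self._last = file_digest(self.path)

    def check(self) -> bool:
        """Run one poll; return True if a change was detected and handled."""
        current = file_digest(self.path)
        if current == self._last:
            return False
        self._last = current
        self.on_change()
        return True

    def run_forever(self, poll_seconds: float = 2.0) -> None:
        """Blocking loop intended to run as the sidecar's main process."""
        while True:
            self.check()
            time.sleep(poll_seconds)
```

A sidecar would construct `ConfigWatcher("pgdog_users.toml", on_change=...)` and call `run_forever()`. Polling keeps the sketch dependency-free; a production version might prefer filesystem notifications.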
Comment by apt-get 6 days ago
Comment by drchaim 6 days ago
If you’re already sharding by tenant for other reasons, OK… But I see CDC to a true OLAP system as more scalable.
PostgreSQL still needs real columnar tables in the core, hopefully one day
Comment by levkk 6 days ago
SELECT tenant_id, COUNT(clicks)
FROM users
GROUP BY tenant_id
ORDER BY 2 DESC
LIMIT 25;
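When a query like the one above has to be answered from several shards, something has to combine the per-shard results. The sketch below shows one generic way to do that merge (sum the partial counts per tenant, then sort and cut to the limit). It is an assumption-laden illustration of scatter-gather aggregation, not PgDog's actual implementation, and it assumes each shard returns its full `GROUP BY` result rather than a pre-limited one.

```python
from collections import Counter


def merge_shard_counts(shard_results, limit=25):
    """Merge per-shard (tenant_id, count) rows into a global top-N.

    shard_results is an iterable with one list of (tenant_id, count)
    tuples per shard. Counts for the same tenant are summed, which is
    the correct way to combine partial COUNT() results.
    """
    totals = Counter()
    for rows in shard_results:
        for tenant_id, count in rows:
            totals[tenant_id] += count
    # ORDER BY 2 DESC, with tenant_id as a tie-breaker for determinism.
    ordered = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return ordered[:limit]


shard_a = [(1, 10), (2, 5)]
shard_b = [(2, 7), (3, 1)]
top_two = merge_shard_counts([shard_a, shard_b], limit=2)
print(top_two)
```

Applying `LIMIT` on each shard before merging would be unsafe whenever a tenant's rows can live on more than one shard, which is why the limit is applied only after the totals are combined.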
Performance is a side effect - definitely needed and we'll do everything we can, but we are not competing with ClickHouse or Snowflake - just trying to make sharded Postgres work with your app.
Comment by vira28 6 days ago
So there is more core work happening on supporting OLAP but I do think it will take some time.
In the meantime, I think we have all the pieces (storage, query engine, table format) to set up a true OLAP. For instance, I created https://github.com/viggy28/streambed to pressure test this idea.
Comment by christoff12 6 days ago
Comment by gen220 6 days ago
Edit: It also might be interesting to point out how your solution differs from what the folks at Planetscale are building https://planetscale.com/neki
Comment by parthdesai 6 days ago
1. Neki, as you mentioned
2. PgDog
3. Multigres, headed by the original creator of Vitess
Comment by frollogaston 6 days ago
Comment by parthdesai 4 days ago
With proxies like pgdog, multigres, and eventually Neki, these can scale out horizontally, so you get true unlimited scale.
Comment by frollogaston 4 days ago
Comment by levkk 4 days ago
Comment by mnbbrown 6 days ago
Comment by frollogaston 6 days ago
Comment by mijoharas 6 days ago
Just to say we're happy pgdog users here! One feature we quite like (of the proxy) is the handling of different connection settings per connection (i.e. statement_timeout). When we investigated RDS proxy (ages ago) it wasn't supported, I think the same was true for pgbouncer so it required a bunch of application changes. With pgdog, it just works transparently.
Comment by levkk 5 days ago
Comment by welder 6 days ago
1. pool exhaustion from idle connections inside open long-running transactions
2. SQLAlchemy's client-side pool using dead connections that PgBouncer had already killed, causing periodic request errors
3. Some tasks have to bypass PgBouncer when they use SET or prepared statements
I've already sharded large datasets at the application layer, but looks like PgDog solves the above problems for any future work?
Comment by frollogaston 6 days ago
#2, shouldn't the client<->PgBouncer connections stay open?
#3 is why I just use client-side pools instead of PgBouncer, but that gets annoying when you have a replicated service so you have to think about the sum of connections across all pools, so I get why people use PgBouncer.
Comment by tempest_ 6 days ago
I had to disable application pooling as it was causing read-only transactions and I couldn't pin down the cause.
Comment by htrp 6 days ago
Still trying to figure out how this works technically. Is the performance gain really just from the rewrite in Rust?
Comment by levkk 6 days ago
Edit:
Performance gains are from having the ability to load balance reads (horizontal scaling for read queries) and scale out writes (with sharding). The single-instance bottleneck in Postgres has many faces:
1. Behind schedule vacuums because of too many dead tuples (too many writes)
2. The WALWriter is single-threaded and IO-bound - Postgres can only do about 200-300MB/sec in writes per instance (real prod numbers on EC2 with NVMes and ZFS, basically best case scenario).
3. Bulkheading: single primary is a single point of failure. With 12 primaries, if one fails, 91% of your customers don't notice.
The list goes on. Rust is just a side effect. We love it because it's fast and correct - the perfect match for a database product.
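The bulkheading figure in point 3 is simple arithmetic, sketched here under the assumption that customers are spread evenly across primaries (the function name is made up for illustration):

```python
def unaffected_fraction(num_primaries: int, failed: int = 1) -> float:
    """Fraction of customers on healthy primaries, assuming an even spread."""
    return (num_primaries - failed) / num_primaries


# With 12 primaries and one failure, 11 of 12 customers are unaffected.
print(f"{unaffected_fraction(12):.1%}")
```

For 12 primaries that is 11/12, or about 91.7%, which matches the roughly 91% quoted above; with a single primary the same function returns 0.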
Comment by hylaride 6 days ago
Comment by VeninVidiaVicii 6 days ago
Comment by levkk 6 days ago
Comment by jeremyjh 6 days ago
This is years of product development with a three person team. If Enterprise sales and support are a big part of your business plan it will suck up a lot more than that.
Comment by simonw 6 days ago
Is there a binary I can run directly?
Comment by levkk 6 days ago
Comment by simonw 6 days ago
Comment by e12e 6 days ago
Then again, sharding on a single host probably isn't very useful anyway - but it might work with docker in swarm mode?
Comment by levkk 6 days ago
Comment by frogbydjsd 6 days ago
Comment by maherbeg 6 days ago
Comment by netswift 5 days ago
Comment by kstrauser 5 days ago
Comment by valorzard 6 days ago
My question is, has there been any talk of upstreaming any of them to Postgres itself? Or of adding a custom built-in feature to Postgres itself?
Comment by levkk 6 days ago
Comment by inigyou 5 days ago
Comment by floriferous 6 days ago
Comment by jeremyjh 6 days ago
Comment by levkk 6 days ago
The same old processes vs. threads debate, plus having the ability to scale the coordinator past a single machine. So, if you're OLTP, definitely consider PgDog. OLAP - Citus still wins because of its advanced query engine. We'll get there.
Comment by ahachete 6 days ago
Since Citus v11 (released 4 years ago), any worker node can also work as a "query router" (a node that you can query against [1]), and works from this perspective as a pure coordinator:
> for very demanding applications, you now have the option to load balance distributed queries across the workers
You can also set up such query routers as dedicated nodes by setting `shouldhaveshards` to `false`, which makes them effectively coordinators (for querying; not for metadata operations).
So with Citus you can absolutely have as many query routers (coordinators if you wish) as you want.
[1]: https://www.citusdata.com/updates/v11-0/#metadata-sync
Edit: formatting, typo
Comment by jeremyjh 6 days ago
TLDR: Tokio concurrency > Process concurrency in OLTP.
Comment by bourbonproof 6 days ago
Comment by saghm 6 days ago
Expanding on that a bit, mongo drivers even have a shared specification of the state machine for monitoring topology changes[1] and algorithm for selecting the server to send an operation to[2] (along with various declarative test cases that the drivers use to validate them alongside the specs in the repo). I think people sometimes underestimate how important the client-side work is to this sort of experience; for all of the faults mongo has had over the years, the amount of investment that they put into the client libraries is something I've never seen anywhere else (although having spent several years working on some of these libraries, my take is likely very biased).
[1]: https://github.com/mongodb/specifications/blob/master/source... [2]: https://github.com/mongodb/specifications/blob/master/source...
Comment by dzonga 6 days ago
it's probably the easiest database to run at scale. run & forget. you just have to do a little more work on the data modeling part before you write your application, i.e. consider your query patterns.
Comment by BowBun 6 days ago
Don't pay a startup for your DB proxy, you should own that layer yourself inside of your infrastructure.
Comment by xyzzy_plugh 6 days ago
Comment by levkk 6 days ago
Comment by jmchuster 6 days ago
pg cat
pg dog
What's he going to name the next version?
pg emu ?
Comment by johnthescott 5 days ago
Comment by re-thc 6 days ago
Comment by BowBun 6 days ago
Comment by apsurd 6 days ago
In fact postgresML took naming heat because Postgres is right there in the name and they weren't affiliated with the brand. "pg" is just two letters. like WP-engine (literally the name as they say it is "double U P engine").
And a cat and a dog is fun.
don't think they're trying to get one over on you.
Comment by BowBun 5 days ago
Comment by aeyes 6 days ago
Unless you have millions of users, you don't really need this. It would be nice to have but it's not a pressing need. So why invest in developing something that you only need once you are at massive scale? At this point you might as well switch away from Postgres because you'll surely have the manpower to do it.
Even with a proxy like PgDog the Postgres sharding story isn't solved. Resharding with logical replication is unlikely to work with databases which are already TBs in size. I never got it to catch up, I had to sync data at the filesystem level which is terrible. Tools like pg_repack also fall apart at scale.
For those that get to a point where a sharding proxy is required, switching databases is a very appealing solution.
And for those that are almost there, application side sharding is more flexible than building a query routing proxy.
Comment by chatmasta 6 days ago
Comment by dujuku 6 days ago
Comment by mamcx 6 days ago
This kind of tool will help in this case?
Comment by levkk 6 days ago
Comment by esafak 6 days ago
Comment by fulafel 6 days ago
Comment by levkk 6 days ago
Comment by danielheath 6 days ago
You _could_ make that ACID, but it's not going to be faster than a single machine.
Comment by gertburger 5 days ago
Comment by octernion 6 days ago
i'm sure you'll get 100x comments about "why not just have one fast SSD? it can do 2000 trillion writes/s"
Comment by levkk 6 days ago
Comment by snihalani 6 days ago
Comment by levkk 5 days ago
Comment by philippemnoel 6 days ago
Comment by bart3r 6 days ago
Would love to hear the advantages of moving to PgDog.
Comment by Wonnk13 6 days ago
Comment by andrey-g 6 days ago
Comment by hodgesrm 3 days ago
Comment by zadikian 6 days ago
Comment by levkk 6 days ago
Comment by melon_tsui 6 days ago
Comment by levkk 6 days ago
Comment by parthdesai 6 days ago
Comment by levkk 6 days ago
1. Let it crash. Increase the RAM, try again.
2. Page to disk (swap), make it slow but ultimately work.
Both have their trade-offs. There is no free lunch here.
Comment by SamInTheShell 6 days ago
Comment by xenophonf 6 days ago
https://github.com/pgdogdev/pgdog/commit/36434f93f03dec1d7d4...
I want to have as much fun as the next developer, but that makes me worry, what with supply chain attacks in the news and all.
Comment by rabidferret 6 days ago
- Sage
Comment by rabidferret 6 days ago
Comment by levkk 6 days ago
In all seriousness, we review every single line of code that goes in and only people who work for PgDog Inc are allowed to merge.
Comment by TurdF3rguson 6 days ago
Comment by redmonduser 6 days ago
Comment by Pet_Ant 6 days ago
Comment by levkk 6 days ago
Comment by dzonga 6 days ago
I don't know how the pg scaling story gets fixed unless certain things are rewritten. that's my fear of going all in pg.
mysql has vitess etc & even upgrades are easier. though pg is more extensible.
Comment by inigyou 5 days ago
However 95% of projects are going to be fine with a normal single-machine database and another 4% are going to be well served by upgrading the hell out of that machine. Only the absolute busiest projects actually need a distributed database and you can cross that bridge when you actually get to it.
They say Amazon processes 20k orders per second. That seems not unachievable for postgres with fast SSDs and careful query optimization, though they don't choose to do it that way. You're not Amazon, you have at most 20 orders per second and that's nothing.
Comment by faangguyindia 6 days ago
Comment by rswail 6 days ago
This solves the thousands of clients case for read in a way that is transparent to the clients.
Yes it's required at large scale, especially if you want to distribute reads or shard to a particular geographical area.
Comment by sandeepkd 6 days ago
Surfacing where and how PG is better than Dynamo or any other database is probably a better starting point than calling PG a silver bullet for everything. At the end of the day it's all a trade-off.
Comment by levkk 6 days ago
Comment by rabidferret 6 days ago
Comment by 999900000999 6 days ago
Comment by pantulis 6 days ago
Comment by rswail 6 days ago
As long as they don't get undercut by the equivalent of AWS https://aws.amazon.com/rds/proxy/ which is a managed pgbouncer.
Comment by 999900000999 6 days ago
You’d need a ton of faith in these 3 people.
Feels more like it would work better inside of a bigger organization.
The QA tester in me is kinda risk averse.
Comment by rswail 6 days ago
They rely on the libraries that are part of Postgres itself to ensure they are parsing the SQL etc "correctly" (where "correct" means "the same as Postgres itself").
Bigger organizations do not necessarily mean higher quality.
What bigger organization is testing PostgreSQL itself?
What are the relative quality measurements of Postgres vs MariaDB vs Oracle vs SQL Server?
Comment by 999900000999 5 days ago
https://mariadb.com/about-us/careers/
Oracle's department that handles DBs probably has at least a few hundred.
3 people would be an ultra lean QA department for a product like this.
I'd have a hard time convincing my boss to go with PgDog over a more stable and tested solution.
This doesn't mean it's bad, just not ready yet
Comment by codegeek 6 days ago
Comment by GHanku 5 days ago
Comment by antonvs 5 days ago
This just seems like fanboyism to me. At the very least, you need to qualify what scenarios you think it's useful for.
I don't doubt that Postgres is good for all the projects you've ever worked on. Generalizing from that, though, is hubristic.
Comment by skiwithuge 6 days ago
Comment by sgt 6 days ago
Comment by s3cur3n3t 5 days ago
Comment by gregaccount 6 days ago
Comment by orliesaurus 6 days ago
Comment by christoff12 6 days ago
Comment by advertum 5 days ago
Comment by mohammedelkarsh 5 days ago
Comment by edge_trader_41 3 days ago
Comment by sonixaep 5 days ago
Comment by RedMagicBox 6 days ago
Comment by exabrial 6 days ago
Not quite. The reason "DBs" like those exist is purely due to fashion. Let's not kid ourselves into thinking they do anything better, with the exception of making data hard to access, which might be a project goal in some cases.
Comment by inigyou 6 days ago