• dvrp an hour ago

    If you are interested in (2026-)internet-scale data engineering challenges (e.g. processing 10s-100s of petabytes) and pre-training/mid-training/post-training scale challenges, please send me an email at d+data@krea.ai!

    • joshuaissac 3 hours ago
      • dang 3 hours ago

        Oh thanks! I've switched the top URL to that now. Submitted URL was https://github.com/datascale-ai/data_engineering_book.

        I hope xx123122 won't mind my mentioning that they emailed us about this post, which originally got caught in a spam filter. I invited them to post a comment giving the background to the project but they probably haven't seen my reply yet. Hopefully soon, given that the post struck a chord!

      • guillem_lefait an hour ago

        The figures in the different chapters are in English (which is not the case for the image in README_en.md).

        • undefined 3 hours ago
          [deleted]
          • rafavargascom 3 hours ago

            Thank you

            How is it possible that a Chinese publication gets to the top of HN?

            • rafavargascom 3 hours ago

              Never mind.