1 comment

  • Nevermark 51 minutes ago
    I think we have barely scratched the surface of inference efficiency for post-trained generative models.

    A uniquely efficient hardware stack, for either training or inference, would be a great moat in an industry that seems to offer few moats.

    I keep waiting to hear of more adoption of Cerebras Systems' wafer-scale chips. They may be held back by not offering the full hardware stack, i.e. their own data centers optimized around wafer-scale compute units. (They do partner with AWS as a third-party provider, in competition with AWS's own silicon.)