Long-context large language models (LLMs) are expensive to train, which hinders customized applications. Sequential Chunk-wise Optimization (SeCO) is a new training paradigm that partitions long inputs into manageable chunks, making training memory-efficient. Sparse Chunk-wise Optimization (SpaCO) further reduces computational overhead by propagating gradients only to a selected subset of chunks, accelerating training. Together, SeCO and SpaCO offer practical benefits: they extend the trainable sequence length and improve training speed for long-context models.
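
To make the chunk-wise idea concrete, below is a minimal, self-contained PyTorch toy sketch, not the authors' SeCO/SpaCO implementation. It processes a long sequence one chunk at a time so that activation memory is bounded by the chunk size, and optionally skips the backward pass on some chunks to mimic SpaCO-style sparse gradient propagation. The recurrent model, chunk size, and sampling probability are all illustrative assumptions; in particular, this toy detaches the carried state between chunks (truncating cross-chunk gradients), whereas the actual SeCO method is described as preserving gradient flow across chunks.

```python
# Toy sketch of chunk-wise optimization (assumptions: a small recurrent model
# stands in for an LLM; SeCO itself keeps gradients flowing across chunks,
# while this simplified version detaches state to keep the example short).
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.GRU(input_size=16, hidden_size=32, batch_first=True)
head = nn.Linear(32, 16)
opt = torch.optim.AdamW(list(model.parameters()) + list(head.parameters()), lr=1e-3)
loss_fn = nn.MSELoss()

seq = torch.randn(2, 4096, 16)          # a "long" input sequence
target = torch.randn(2, 4096, 16)
chunk_size = 256                        # activations live only for one chunk at a time
backprop_prob = 0.5                     # SpaCO-style: back-propagate only on some chunks

hidden = None
opt.zero_grad()
for x, y in zip(seq.split(chunk_size, dim=1), target.split(chunk_size, dim=1)):
    out, hidden = model(x, hidden)      # carry state forward across chunks
    hidden = hidden.detach()            # cap the autograd graph at one chunk
    if torch.rand(()) < backprop_prob:  # sparse chunk-wise gradient propagation
        loss = loss_fn(head(out), y)
        loss.backward()                 # per-chunk backward frees activations early
opt.step()
```

Because each `backward()` call covers only one chunk's graph, peak memory scales with `chunk_size` rather than the full sequence length, and skipping the backward pass on a fraction of chunks trades a small amount of gradient signal for faster steps.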