Add basic simple example of streaming a large JSON file #137

Open · dandv opened this issue May 24, 2023 · 5 comments

dandv commented May 24, 2023

The README for this module is a bit daunting... lots of people are simply looking to parse a large JSON file and get its objects one by one.

It would be really helpful to have a simple, canonical example of how to parse a large JSON file that consists of an arbitrary number of objects in an array. Ideally, that would be a generator pattern, so streaming can be aborted at will and the items can be iterated with a for loop, as with Python's ijson library:

import ijson

with open('large_file.json', 'rb') as f:
    # lazily yield each element of the top-level JSON array
    objects = ijson.items(f, 'item')
    for o in objects:
        add_object(o)
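
For comparison, a minimal sketch of the same pattern with this package might look like the following (CommonJS style, as in the package docs; it assumes the file holds a single top-level JSON array, and 'large_file.json' plus the per-item processing are placeholders):

const {chain} = require('stream-chain');
const {parser} = require('stream-json');
const {streamArray} = require('stream-json/streamers/StreamArray');
const fs = require('fs');

async function main() {
  // bytes -> JSON tokens -> one {key, value} pair per element of the top-level array
  const pipeline = chain([
    fs.createReadStream('large_file.json'),
    parser(),
    streamArray(),
  ]);

  // Node readable streams are async iterable, so the loop can stop
  // at any point, much like the ijson generator above
  for await (const {value} of pipeline) {
    console.log(value); // process each array element here
  }
}

main().catch(console.error);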
uhop self-assigned this May 25, 2023

dandv (Author) commented May 25, 2023

@uhop: I've added an example at the top of the most upvoted SO answer for the package. What do you think?

uhop (Owner) commented May 25, 2023

It looks good. I would suggest describing what kind of input is expected; it makes the code more transparent. I plan to put a similar example (or the same one) at the top of the README.

PS: make sure that the example actually runs. Non-working examples damage credibility.

dandv (Author) commented May 25, 2023

It runs fine on an array of objects. I used it in this demo for work.

RichardJECooke commented

I tried copying this simple example, but it just outputs "undefined" for each JSON element. Why, please?

import parser from 'stream-json';
import StreamArray from 'stream-json/streamers/StreamArray.js';
import Chain from 'stream-chain';
import * as fs from "fs";

importUsers().catch(error => console.error(error));

async function importUsers() {
    const users = new Chain([
      fs.createReadStream('users.json'),
      parser(),
      new StreamArray(),
    ]);
    for await (const { user } of users) {
      console.log(user);
    }
}

/*
users.json:

[
  {
    "name": "a"
  },
  {
    "name": "b"
  },
  {
    "name": ""
  }
]

*/

RichardJECooke commented

Never mind, GPT fixed it: for await (const { value: user } of users) {
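
For context: StreamArray emits one { key, value } object per array element (key is the array index), which is why destructuring { user } yields undefined. Under that assumption, the corrected loop reads:

for await (const { value: user } of users) {
  console.log(user); // prints { name: 'a' }, { name: 'b' }, { name: '' }
}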
