Lists, lists, lists. I love lists; just about everything can be a list of something. There are a thousand ways to solve problems with lists. Let's talk about de-duping a long list and then cutting that list up into batches. The problem encountered was a list of some 60,000 IDs, and in that list there were 3 dupes. Then, once the list was clean, it needed to be batched up with 5000 IDs per batch.
The initial idea was to build iterating loops to cycle through the list and find the dupes, using a loop nested inside a loop. This was going to be tedious and perhaps 15 lines of code, pretty standard stuff. Off to Google, and of course I ended up on Stack Overflow. Python has a built-in type called "set".
8.7. sets - Unordered collections of unique elements - Python 2.7.15 documentation
Use some built-in Python functionality and the problem solves itself.
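The original code sample did not survive in this copy, so here is a minimal sketch of the idea, assuming the IDs are hashable values such as integers or strings. The names `ids`, `unique_ids`, and `ordered_unique_ids` are made up for illustration. One caveat: a set does not keep the original order, so the sketch also shows `dict.fromkeys` as one possible alternative when order matters.

```python
# Hypothetical sample data: a short list of IDs with one duplicate.
ids = [101, 102, 103, 102, 104]

# set() keeps one copy of each value, but does not preserve order.
unique_ids = list(set(ids))

# If the original order matters, dict.fromkeys keeps the first occurrence
# of each value in order (dicts are insertion-ordered on Python 3.7+).
ordered_unique_ids = list(dict.fromkeys(ids))

# Number of duplicates that were dropped.
dupe_count = len(ids) - len(unique_ids)
```

Comparing `len(ids)` with `len(unique_ids)` is a quick way to confirm how many dupes were in the list to begin with.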
Next up: turning the single list into batched lists. Again, this is pretty simple, but it can be made even easier with a single yield and the built-in len(). What's yield?
The Python yield keyword explained
Yield is a keyword that is used like return, except the function will return a generator. What is a generator? Generators are iterators, but you can only iterate over them once. That's because they do not store all the values in memory; they generate the values on the fly.
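To make the "only once" part concrete, here is a small sketch. The function name `squares` is hypothetical and is not taken from the linked answer.

```python
def squares(n):
    """Yield the squares of 0..n-1 one at a time."""
    for x in range(n):
        yield x * x

gen = squares(3)          # nothing runs yet; this only creates the generator
first_pass = list(gen)    # consumes the generator: [0, 1, 4]
second_pass = list(gen)   # the generator is exhausted, so this is []
```

The second pass comes back empty because the generator never stored its values; once they have been produced, they are gone.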
The way we will use this is to create a separate function (a def) for the chunking, then pass it the list and the length of the batch. The function will then run the for/yield until it runs out of data. WHAT? Here's an example.
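The embedded example is missing from this copy, so the following is a sketch of what such a chunking function could look like, not necessarily the author's exact code. The name `chunk` and its parameters `items` and `size` are assumptions, and the list is kept tiny so the result is easy to read.

```python
def chunk(items, size):
    """Yield successive slices of `items`, each at most `size` long."""
    # range() steps through the start index of each batch; len() sets the end.
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Hypothetical data: 12 IDs split into batches of 5.
ids = list(range(12))

batches = []
for batch in chunk(ids, 5):
    batches.append(batch)

# batches is now [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11]]
```

The last batch is simply shorter when the list length is not a multiple of the batch size, because slicing past the end of a list is safe in Python.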
The first time the for calls the generator object created from your function, it will run the code in your function from the beginning until it hits yield, then it'll return the first value of the loop. Then, each other call will run the loop you have written in the function one more time, and return the next value, until there is no value to return.
What this ends up doing is producing all the values of the original list broken up into batches; wrap the generator in list() and you get a list of lists. There you have it, quick and dirty, and much less painful than iterating with loops the hard way. Let Python do Python things and we can leverage the power of the language to chew up data.
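Putting both steps together with the numbers from the problem above gives a rough end-to-end sketch. The data here is made up (59,997 distinct integers plus 3 repeats), and `chunk` is the same hypothetical helper name, redefined so the block runs on its own.

```python
def chunk(items, size):
    """Yield successive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Made-up data: 59,997 distinct IDs plus 3 repeats, 60,000 entries in total.
ids = list(range(59997)) + [7, 42, 1234]

# De-dupe with set(); sorted() turns it back into a list in a stable order.
unique_ids = sorted(set(ids))

# Materialise the generator into a list of lists, 5000 IDs per batch.
batches = list(chunk(unique_ids, 5000))
batch_sizes = [len(b) for b in batches]
```

With these numbers, 59,997 unique IDs come out as 12 batches: eleven full batches of 5000 and a final one of 4997.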
Lists are a simple concept, but the simplest concepts can be the most powerful. It is always important to understand the hard way of doing things first, and then learn the quickest and easiest way possible. When the shortcut breaks, or doesn't work for an implementation, knowing how it works means it can be fixed or reworked another way.
Python: list de-duping, list of lists batches was originally published in Hacker Noon on Medium.