If you wish to shuffle your DataFrame in place and reset the index, you could do e.g. df = df.sample(frac=1).reset_index(drop=True). Here, specifying drop=True prevents .reset_index from creating a column containing the old index entries.

Follow-up note: Although it may not look like the above operation is in-place, python/pandas is smart enough not to do another malloc for the shuffled object. That is, even though the reference object has changed (by which I mean id(df_old) is not the same as id(df_new)), the underlying C object is still the same. To show that this is indeed the case, you could run a simple memory profiler: $ python3 -m memory_profiler.

TL;DR: np.random.shuffle(ndarray) can do the job. So, in your case, np.random.shuffle(DataFrame.values). DataFrame, under the hood, uses a NumPy ndarray as its data holder. (You can check this in the DataFrame source code.) So if you use np.random.shuffle(), it shuffles the array along the first axis of a multi-dimensional array, while the index of the DataFrame remains unshuffled.

Though, there are some points to consider. np.random.shuffle() shuffles in place and returns None, so in case you want to keep a copy of the original object, you have to make it before you pass the object to the function. sklearn.utils.shuffle(), as user tj89 suggested, can designate random_state along with another option to control output. But it WILL SHUFFLE the axis info (index, column) of the DataFrame along with the ndarray it contains.

Benchmark between sklearn.utils.shuffle() and np.random.shuffle(): on a DataFrame, sklearn.utils.shuffle(df) was about 3x faster than np.random.shuffle(df.values).

Conclusion: If it is okay for the axis info (index, column) to be shuffled along with the ndarray, use sklearn.utils.shuffle(). Otherwise, use np.random.shuffle().

Code used: import timeit, then timeit.timeit('nd = sklearn.utils.shuffle(nd)', setup=setup, number=1000) and timeit.timeit('np.random.shuffle(nd)', setup=setup, number=1000), where setup is the string passed to timeit to prepare nd.

Parameters (from a PyTorch function reference): n (int): the upper bound (exclusive). Keyword arguments: generator (torch.Generator, optional): a pseudorandom number generator for sampling. out (Tensor, optional): the output tensor. dtype (torch.dtype, optional): the desired data type of the returned tensor.
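The np.random.shuffle() points above can be sketched roughly as follows. This is only an illustration under stated assumptions: the array contents, the index labels, the seed, and the timeit setup string are all made up here, since the original setup is not shown. Rather than shuffling df.values directly (whether that affects the DataFrame can depend on the dtype layout and the pandas version), the sketch shuffles an explicit copy and rebuilds the frame, which keeps the index in its original order.

```python
import timeit

import numpy as np
import pandas as pd

np.random.seed(0)  # hypothetical seed, only to make the sketch repeatable

# np.random.shuffle() works in place and returns None,
# so copy the array first if the original order is still needed.
nd = np.arange(12).reshape(4, 3)
original = nd.copy()
result = np.random.shuffle(nd)  # rows (first axis) are permuted; each row stays intact

# For a DataFrame, shuffle an explicit copy of the data and rebuild the frame,
# which keeps the index and column labels in their original order.
df = pd.DataFrame(original, index=list("abcd"), columns=["x", "y", "z"])
values = df.to_numpy(copy=True)
np.random.shuffle(values)
shuffled_df = pd.DataFrame(values, index=df.index, columns=df.columns)

# Timing sketch in the style of the benchmark; this setup string is an assumption.
setup = "import numpy as np\nnd = np.random.random((200, 50))"
elapsed = timeit.timeit("np.random.shuffle(nd)", setup=setup, number=100)
print(f"np.random.shuffle: {elapsed:.4f} s for 100 runs")
```

Timings from a sketch like this will vary by machine and array size, so they should not be read as a reproduction of the benchmark figures.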
The idiomatic way to shuffle DataFrame rows with Pandas is to use the .sample method of your data frame to sample all rows without replacement: df.sample(frac=1). The frac keyword argument specifies the fraction of rows to return in the random sample, so frac=1 means to return all rows (in random order).
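As a minimal sketch of the df.sample(frac=1) idiom, the example below shuffles a small, made-up DataFrame. The column names and data are hypothetical, and random_state=42 is an assumption added here only so the result is repeatable; it is not required for shuffling.

```python
import pandas as pd

# Hypothetical example data
df = pd.DataFrame({"a": range(6), "b": list("uvwxyz")})

# frac=1 returns every row exactly once, in random order.
# Each row keeps its original index label.
shuffled = df.sample(frac=1, random_state=42)

# drop=True discards the old index instead of adding it as a column.
reset = shuffled.reset_index(drop=True)

# A smaller frac returns only that fraction of the rows.
half = df.sample(frac=0.5, random_state=42)

print(shuffled)
print(reset)
```

Because sample returns a new object, the original df is left unchanged unless the result is assigned back to it.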