Skip to content

Implement interleave()#405

Merged
cuviper merged 10 commits intorayon-rs:masterfrom
laumann:add-interleave
Sep 22, 2017
Merged

Implement interleave()#405
cuviper merged 10 commits intorayon-rs:masterfrom
laumann:add-interleave

Conversation

@laumann
Copy link
Copy Markdown

@laumann laumann commented Sep 3, 2017

For parity with itertools, interleave() should alternately produce
elements from two given iterators. Once one of the iterators run out,
elements should just be drawn from the other until both are exhausted.

This is from #384

@laumann
Copy link
Copy Markdown
Author

laumann commented Sep 11, 2017

I'm not sure why this fails, some of the errors are that the ? operator is not stable. Is there something I need to do?

@cuviper
Copy link
Copy Markdown
Member

cuviper commented Sep 11, 2017

It's not you, but futures update -- see #412.

At a glance, the PR looks great. I'll try to give it a proper review soon!

@laumann
Copy link
Copy Markdown
Author

laumann commented Sep 12, 2017

Ok, thank you :-) Let me know if there's anything I can do...

@cuviper
Copy link
Copy Markdown
Member

cuviper commented Sep 17, 2017

Can you now please rebase, to see how CI fares?

@laumann
Copy link
Copy Markdown
Author

laumann commented Sep 18, 2017

Sure thing

Thomas Jespersen added 3 commits September 18, 2017 09:42
For parity with itertools, interleave() should alternately produce
elements from two given iterators. Once one of the iterators run out,
elements should just be drawn from the other until both are exhausted.

This is from #384
Remove unused import, remove commented test code and add more test cases
@laumann
Copy link
Copy Markdown
Author

laumann commented Sep 18, 2017

Sorry about the noise, I rebased on a different machine, and forgot to set core.name and core.email - should be fixed now.

Thomas Jespersen added 2 commits September 18, 2017 10:08
@laumann
Copy link
Copy Markdown
Author

laumann commented Sep 18, 2017

@cuviper It should build now, I installed a rust 1.12.0 locally. The commitment to backwards compatibility is pretty cool, I think. 🤞 for tests passing

@cuviper
Copy link
Copy Markdown
Member

cuviper commented Sep 18, 2017

Looks like the order is getting mixed up:

---- iter::test::check_interleave_eq stdout ----
	thread 'iter::test::check_interleave_eq' panicked at 'assertion failed: `(left == right)`
  left: `[0, 10, 1, 2, 11, 12, 3, 13, 4, 14, 5, 15, 6, 7, 16, 17, 8, 18, 9, 19]`,
 right: `[0, 10, 1, 11, 2, 12, 3, 13, 4, 14, 5, 15, 6, 16, 7, 17, 8, 18, 9, 19]`', src/iter/test.rs:1735:4

@laumann
Copy link
Copy Markdown
Author

laumann commented Sep 18, 2017

Yeah, I'm not sure why though - I can't get it to fail locally, and it seems to be only on nightly that it fails :-/

@cuviper
Copy link
Copy Markdown
Member

cuviper commented Sep 18, 2017

it seems to be only on nightly that it fails

nightly is the only one that runs tests, if only because the dev-dependency compiletest_rs requires it (and dev-deps can't be conditional).

@laumann
Copy link
Copy Markdown
Author

laumann commented Sep 18, 2017

Hmm, that makes sense - am I doing something wrong then? I run cargo +nightly test, and I tried setting RUSTFLAGS='--cfg=rayon_unstable' as well, but so far no dice.

@cuviper
Copy link
Copy Markdown
Member

cuviper commented Sep 18, 2017

I think Travis runs with 2-cpus, so setting RAYON_NUM_THREADS=2 locally may get you closer. I suggest adding a temporary debug-print with all your index, i_idx, j_idx, i_split, j_split, flag, even etc. so you have a trace of how things go awry.

@laumann
Copy link
Copy Markdown
Author

laumann commented Sep 18, 2017

Oh, that did it, thanks! I'll do some debugging now :-)

The flag indicating the next iterator to pull from was almost carried
through all the way, except for InterleaveSeq's constructor, which
always set it to false.
@laumann
Copy link
Copy Markdown
Author

laumann commented Sep 18, 2017

Thanks for the help @cuviper! I had forgotten to carry the flag through to InterleaveSeq's constructor.

{
i: I,
j: J,
flag: bool,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need the flag here? I think it's not used until we get to the Producer.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right

j: J,
i_len: usize,
j_len: usize,
flag: bool,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need a more informative name for this flag, and a doc-comment explaining its purpose as well. Think of the poor developer who may have to revisit this code many months later... :)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to think of a better name... names are hard :-)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm open for suggestions on the name :-) Any of these you'd prefer?

  • toggle_flag
  • next_iter/next_producer

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's indicating which iterator will produce the next item, so maybe i_next or j_next? (depending on what you want true to mean)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with i_next.

self.flag = !self.flag;
if self.flag {
match self.i.next() {
None => self.j.next(),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sort of assumes that i will be fused (continue returning None), as the flag flips back and forth trying this iterator again. We may want a done: bool to remember and stop flipping the flag, or I guess len() can tell us this too.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the itertools version is fused. I'm not really sure of the best way to handle it, I would just add a done bool I think

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, it's literally fused! You could just use that too. Fuse has a done bool itself, but it also specializes on the unstable FusedIterator trait to skip that check for naturally-fused iterators.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I added Fuse to both iterators in InterleaveSeq

pub struct InterleaveSeq<I, J> {
i: I,
j: J,
flag: bool
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, better name and comments please.


fn size_hint(&self) -> (usize, Option<usize>) {
let (ih, jh) = (self.i.size_hint(), self.j.size_hint());
let min = ih.0.checked_add(jh.0).unwrap_or(usize::MAX);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps saturating_add instead?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

{
#[inline]
fn next_back(&mut self) -> Option<I::Item> {
if self.i.len() <= self.j.len() {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if they are equal length, then it needs to take the flag into account.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, done.

src/iter/mod.rs Outdated
zip::new(self, zip_op.into_par_iter())
}

/// Interleave elements of this iterator and the other given iterator.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your PR is more descriptive than these docs -- please elaborate! An example would be nice too.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Elaborated and added example.

xs.par_iter().interleave(&ys).map(|&i| i).collect_into(&mut res);
assert_eq!(expected, res, "Case {} failed", i+1);
}
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about a test under rev(), for the next_back behavior?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added another assert_eq that tests each case when using rev()

Thomas Jespersen added 3 commits September 20, 2017 22:31
Documentation in a few places, add tests with `rev()` to exercise
`next_back()` and fix the `next_back()` implementation.
 - Remove ExactSizeIterator from Iterator impl for InterleaveSeq
 - Use `saturating_add()` instead of `checked_add().unwrap_or(MAX)`
 - Rename `flag` to `i_next`
 - Use `Fuse` for the iterators embedded in `InterleaveSeq`
@cuviper
Copy link
Copy Markdown
Member

cuviper commented Sep 22, 2017

Let's merge -- thanks!

@cuviper cuviper merged commit 2cf5ac2 into rayon-rs:master Sep 22, 2017
@laumann
Copy link
Copy Markdown
Author

laumann commented Sep 22, 2017

Thanks! It should be relatively easy to implement interleave_shortest, I think, so I could give that a shot if you're interested.

@laumann laumann deleted the add-interleave branch September 22, 2017 10:05
@cuviper
Copy link
Copy Markdown
Member

cuviper commented Sep 22, 2017

Yeah, I think interleave_shortest will be a simple wrapper, much like zip_eq and zip. Go for it!

@laumann
Copy link
Copy Markdown
Author

laumann commented Sep 22, 2017

Hmm, it may not be as trivial as I thought - the zip_eq is simple, because it is just zip with an added assertion in front. interleave_shortest should terminate once one of the iterators runs out (interleave exhausts them both).

But by their example, it is not simply take()ing the first n elements where n = cmp::min(i.len(), j.len()):

let it = (1..7).interleave_shortest(vec![-1, -2]);
itertools::assert_equal(it, vec![1, -1, 2, -2, 3]);

(source)

Observe that the last element is drawn from the first iterator, ie three elements are drawn from that one while the second only provides two elements.

I can see how I'd change with_producer for Interleave to select the right lengths for creating the producers, but I'd like to avoid copying too much code if possible. Maybe I'll try to pull that code out in interleave.rs and implement InterleaveShortest in interleave.rs? But I'm for better suggestions on how to implement this :-)

@cuviper
Copy link
Copy Markdown
Member

cuviper commented Sep 22, 2017

Something like this?

if i.len() <= j.len() {
    // take equal lengths from both iterators
    let n = i.len();
    i.take(n).interleave(j.take(n))
} else {
    // take one extra item from the first iterator
    let n = j.len();
    i.take(n + 1).interleave(j.take(n))
}

(using take even when it's redundant, to get consistent types)

@laumann
Copy link
Copy Markdown
Author

laumann commented Sep 22, 2017

Thanks for the suggestion! I was trying something similar, but run into:

error[E0308]: mismatched types
  --> src/iter/interleave.rs:49:9
   |
42 | pub fn new_shortest<I, J>(i: I, j: J) -> Interleave<I, J>
   |                                          ---------------- expected `iter::interleave::Interleave<I, J>` because of return type
...
49 |         i.take(n).interleave(j.take(n))
   |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected type parameter, found struct `iter::take::Take`
   |
   = note: expected type `iter::interleave::Interleave<I, J>`
              found type `iter::interleave::Interleave<iter::take::Take<I>, iter::take::Take<J>>`

error[E0308]: mismatched types
  --> src/iter/interleave.rs:53:9
   |
42 | pub fn new_shortest<I, J>(i: I, j: J) -> Interleave<I, J>
   |                                          ---------------- expected `iter::interleave::Interleave<I, J>` because of return type
...
53 |         i.take(n + 1).interleave(j.take(n))
   |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected type parameter, found struct `iter::take::Take`
   |
   = note: expected type `iter::interleave::Interleave<I, J>`
              found type `iter::interleave::Interleave<iter::take::Take<I>, iter::take::Take<J>>`

@cuviper
Copy link
Copy Markdown
Member

cuviper commented Sep 22, 2017

pub fn new_shortest<I, J>(i: I, j: J) -> Interleave<I, J>

I think we'll want this to return a distinct InterleaveShortest<I, J> type, so we're not bound to any particular implementation details. This custom type can contain an Interleave<Take<I>, Take<J>>, and forward all methods to that. (as ZipEq does)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants