docs/models.md: 2 changes (1 addition & 1 deletion)
@@ -7,7 +7,7 @@ This is an abstract class that cannot be used as a model during training. Other

 | Name | Default | Description |
 | --- | --- | --- |
-| `optimizer.name` | `Adam` | Type of Optimizer to use, e.g. `Adam`, `SGD` or `Momentum`. The name is fed to TensorFlow's [optimize_loss](https://www.tensorflow.org/api_docs/python/contrib.layers/optimization#optimize_loss) function. See TensorFlow documentation for more details and all available options. |
+| `optimizer.name` | `Adam` | Type of Optimizer to use, e.g. `Adam`, `SGD` or `Momentum`. The name is fed to TensorFlow's [optimize_loss](https://www.tensorflow.org/api_docs/python/tf/contrib/layers/optimize_loss) function. See TensorFlow documentation for more details and all available options. |
 | `optimizer.learning_rate` | `1e-4` | Initial learning rate for the optimizer. This is fed to TensorFlow's [optimize_loss](https://www.tensorflow.org/api_docs/python/contrib.layers/optimization#optimize_loss) function. |
 | `optimizer.lr_decay_type` | | The name of one of TensorFlow's [learning rate decay functions](https://www.tensorflow.org/api_docs/python/#training--decaying-the-learning-rate) defined in `tf.train`, e.g. `exponential_decay`. If this is an empty string (default) then no learning rate decay is used. |
 | `optimizer.lr_decay_steps` | `100` | How often to apply decay. This is fed as the `decay_steps` argument to the decay function defined above. See TensorFlow documentation for more details. |
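For orientation, here is a minimal sketch of how these optimizer keys might be set in a training config. The key names come from the table above; the file name, the `model_params` grouping, and the concrete values are illustrative assumptions, not part of this change.

```yaml
# model.yml (hypothetical): optimizer settings for bin.train
model_params:
  optimizer.name: Adam                        # passed to optimize_loss
  optimizer.learning_rate: 0.0001             # initial learning rate
  optimizer.lr_decay_type: exponential_decay  # empty string disables decay
  optimizer.lr_decay_steps: 100               # decay_steps for the decay fn
```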
docs/training.md: 2 changes (1 addition & 1 deletion)
@@ -39,7 +39,7 @@ python -m bin.train --config_paths config.yml

 Multiple configuration files are merged recursively, in the order they are passed. This means you can have separate configuration files for model hyperparameters, input data, and training options, and mix and match as needed.

-For a concrete examples of configuration files, refer to the [example configurations](https://github.com/google/seq2seq/tree/master/example_configs) and [Neural Machine Translation Tutorial](NMT/).
+For concrete examples of configuration files, refer to the [example configurations](https://github.com/google/seq2seq/tree/master/example_configs) and the [Neural Machine Translation Tutorial](nmt/).


 ## Monitoring Training
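As a sketch of the recursive merge in practice: the file names below are hypothetical, and files are merged left to right, so later files override overlapping keys from earlier ones.

```bash
# Hypothetical split into hyperparameters, input data, and training options;
# train_options.yml wins on any conflicting keys.
python -m bin.train \
  --config_paths="model_params.yml,input_data.yml,train_options.yml"
```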
seq2seq/data/split_tokens_decoder.py: 5 changes (5 additions & 0 deletions)
@@ -51,6 +51,11 @@ def decode(self, data, items):
     decoded_items = {}

     # Split tokens
+    if self.delimiter == "":
+      # With an empty delimiter, tf.string_split splits on bytes rather than
+      # unicode characters, e.g. '\xc2\xa3' is split into ['\xc2', '\xa3'].
+      # See issue #153.
+      tf.logging.warning("Splitting on bytes; take care with unicode strings.")
     tokens = tf.string_split([data], delimiter=self.delimiter).values

     # Optionally prepend a special token
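A minimal sketch of the behavior the new warning points at, using the TF 1.x graph-mode API this project targets; the `'\xc2\xa3'` byte sequence is the example from the comment above.

```python
# -*- coding: utf-8 -*-
import tensorflow as tf

# An empty delimiter makes tf.string_split split on bytes, so the two-byte
# UTF-8 encoding of '£' ('\xc2\xa3') comes back as two one-byte tokens.
tokens = tf.string_split(["\xc2\xa3"], delimiter="").values
with tf.Session() as sess:
  print(sess.run(tokens))  # prints two one-byte tokens, not a single '£'
```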
seq2seq/tasks/decode_text.py: 7 changes (7 additions & 0 deletions)
@@ -23,13 +23,20 @@
 import functools
 from pydoc import locate

+import sys
 import numpy as np

 import tensorflow as tf
 from tensorflow import gfile

 from seq2seq.tasks.inference_task import InferenceTask, unbatch_dict

+# Wrap stdout under Python 2 so unicode output is encoded explicitly
+# instead of raising UnicodeEncodeError when stdout is piped.
+if sys.version_info < (3, 0):
+  import codecs
+  if hasattr(sys.stdout, 'encoding'):
+    sys.stdout = codecs.getwriter(sys.stdout.encoding or "UTF-8")(sys.stdout)


 def _get_prediction_length(predictions_dict):
   """Returns the length of the prediction based on the index
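A minimal sketch of the failure mode the wrapper guards against: under Python 2, when output is piped (e.g. into `head`), `sys.stdout.encoding` is `None`, so printing a unicode string raises `UnicodeEncodeError` unless stdout is wrapped as above. The sample text is illustrative.

```python
# -*- coding: utf-8 -*-
import codecs
import sys

if sys.version_info < (3, 0) and hasattr(sys.stdout, "encoding"):
  # When output is piped, sys.stdout.encoding is None; fall back to UTF-8.
  sys.stdout = codecs.getwriter(sys.stdout.encoding or "UTF-8")(sys.stdout)

print(u"prediction: caf\u00e9")  # encodes to UTF-8 instead of raising
```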