Syllable
========
Version 1.4.5

[![Build Status](https://travis-ci.org/vanderlee/phpSyllable.svg)](https://travis-ci.org/vanderlee/phpSyllable)

Copyright &copy; 2011-2016 Martijn van der Lee.
MIT Open Source license applies.

Introduction
------------
PHP Syllable splitting and hyphenation.
or rather...
PHP Syl-la-ble split-ting and hy-phen-ation.

Based on the work by Frank M. Liang (http://www.tug.org/docs/liang/)
and the many volunteers in the TeX community.

Supports 76 languages in total, including English (US/UK), Spanish, German,
French, Dutch, Italian, Romanian and Russian.

Language sources: http://tug.org/tex-hyphen/#languages

Supports PHP 5.2 and up, so you can use it on older servers.

Quick start
-----------
Include phpSyllable in your project, point the autoloader at the `classes`
directory and instantiate the `Syllable` class.

	$syllable = new Syllable('en-us');
	echo $syllable->hyphenateText('Provide a plethora of paragraphs');

`Syllable` class reference
--------------------------
The following is an incomplete list, containing only the most common methods.
For complete documentation of all classes, read the generated [PHPDoc](doc).

### public __construct(  $language = 'en',  $hyphen = null )
Create a new Syllable instance, with defaults.

### public static setCacheDir(  $dir )
Set the directory where compiled language files may be stored.
Defaults to the `cache` subdirectory of the current directory.

### public static setLanguageDir(  $dir )
Set the directory where language source files can be found.
Defaults to the `languages` subdirectory of the current directory.

### public setLanguage(  $language )
Set the language whose rules will be used for hyphenation.

### public setHyphen( Mixed $hyphen )
Set the hyphen text or object to use as a hyphen marker.

### public array splitWord(  $word )
Split a single word at the points where hyphens would go.

### public array splitText(  $text )
Split a text at the points where hyphens would go.

### public string hyphenateWord(  $word )
Hyphenate a single word.
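
Conceptually, hyphenating a word amounts to joining its split parts with the
hyphen marker. The following is a minimal standalone sketch of that idea, not
the library's implementation: `join_parts` is a hypothetical helper, and the
parts are written out by hand (taken from "hy-phen-ation" in the introduction)
rather than produced by `splitWord`.

```php
<?php
// Hypothetical helper, not part of the Syllable API: join already-split
// word parts with a hyphen marker.
function join_parts(array $parts, $hyphen = '-')
{
    return implode($hyphen, $parts);
}

// Parts written by hand for illustration.
$parts = array('hy', 'phen', 'ation');

echo join_parts($parts), "\n";          // plain-text hyphen
echo join_parts($parts, '&shy;'), "\n"; // soft-hyphen HTML entity
```

The same parts can be joined with any marker, which is why the hyphen is
configurable separately from the language.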

### public string hyphenateText(  $text )
Hyphenate all words in the plain text.

### public string hyphenateHtml(  $html )
Hyphenate all readable text in the HTML, excluding HTML tags and attributes.
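
To illustrate what "excluding HTML tags and attributes" means, here is a rough
standalone sketch that applies a transformation only to the text between tags.
It is an assumption-laden illustration, not how the library parses HTML:
`transform_text_nodes` is a hypothetical helper, the regular expression is
deliberately naive (it does not handle comments, scripts or `>` inside
attribute values), and `strtoupper` stands in for hyphenation so the effect is
easy to see.

```php
<?php
// Hypothetical helper, not part of the Syllable API: apply $callback to the
// text between tags while leaving the tags (and their attributes) untouched.
function transform_text_nodes($html, $callback)
{
    // With PREG_SPLIT_DELIM_CAPTURE the result alternates between text
    // (even indexes) and captured tags (odd indexes).
    $pieces = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($pieces as $i => $piece) {
        if ($i % 2 === 0) {
            $pieces[$i] = call_user_func($callback, $piece);
        }
    }

    return implode('', $pieces);
}

// The attribute value stays lowercase; only the readable text changes.
echo transform_text_nodes('<p title="syllable">syllable</p>', 'strtoupper'), "\n";
```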

### public array histogramText(  $text )
Count the number of syllables in the text and return a map with
syllable count as key and number of words for that syllable count as
the value.
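
To make the shape of that map concrete, here is a minimal standalone sketch
that does not call the library. It assumes the syllable count of each word is
already known; `histogram_from_counts` and the sample counts are hypothetical,
and the key sorting is only there for readability.

```php
<?php
// Hypothetical helper, not part of the Syllable API: build a map of
// "syllable count => number of words" from per-word syllable counts.
function histogram_from_counts(array $syllableCounts)
{
    $histogram = array();

    foreach ($syllableCounts as $count) {
        if (!isset($histogram[$count])) {
            $histogram[$count] = 0;
        }
        $histogram[$count]++;
    }

    ksort($histogram); // order the keys by syllable count

    return $histogram;
}

// Assumed counts for five words: 1, 2, 2, 3 and 1 syllables.
print_r(histogram_from_counts(array(1, 2, 2, 3, 1)));
```

For these counts the map holds two one-syllable words, two two-syllable words
and one three-syllable word.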

### public integer countWordsText(  $text )
Count the number of words in the text.

### public integer countPolysyllablesText(  $text )
Count the number of polysyllables in the text.
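
As a rough illustration, a polysyllable count can be derived from a map of
syllable count to number of words, like the one `histogramText` is described
as returning. This sketch assumes a polysyllable is a word with three or more
syllables, which is the usual readability-formula definition but is not stated
in this document; `count_polysyllables` and the sample map are hypothetical.

```php
<?php
// Hypothetical helper, not part of the Syllable API.
// Assumption: a polysyllable is a word with at least $minSyllables syllables.
function count_polysyllables(array $histogram, $minSyllables = 3)
{
    $total = 0;

    foreach ($histogram as $syllables => $words) {
        if ($syllables >= $minSyllables) {
            $total += $words;
        }
    }

    return $total;
}

// Sample map: syllable count => number of words.
echo count_polysyllables(array(1 => 2, 2 => 2, 3 => 1, 4 => 2)), "\n";
```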

Example
-------
See the included [demo.php](demo.php) file for a working example.

	// Set up the autoloader (if needed)
	require_once dirname(__FILE__) . '/classes/autoloader.php';

	// Create a new instance for the language
	$syllable = new Syllable('en-us');

	// Set the directory where the .tex files are stored
	$syllable->getSource()->setPath(dirname(__FILE__) . '/languages');

	// Set the directory where Syllable can store cache files
	$syllable->getCache()->setPath(dirname(__FILE__) . '/cache');

	// Set the hyphen style. In this case, the &shy; HTML entity
	// for HTML (falls back to '-' for text)
	$syllable->setHyphen(new Syllable_Hyphen_Soft);

	// Set the threshold (sensitivity); deprecated since 1.2 and may be omitted
	$syllable->setTreshold(Syllable::TRESHOLD_MOST);

	// Output hyphenated text
	echo $syllable->hyphenateText('Provide your own paragraphs...');

Changes
-------
1.4.4
-	Composer autoloader added

1.4.3
-	Improved documentation

1.4.2
-	Updated spanish language files.
-	Initial PHPDoc.

1.4.1
-	More fixes for apostrophes in splitting.

1.4
-	Fix for French language handling
-	Refactored .tex loading into source class.
-	Massive cache performance increase (removed excessive writes).

1.3.1
-	Fix slow initial cache writing; too many writes (only one was needed).
-	Removed min_hyphenation; mb_strlen takes more time than hashmap lookup.

1.3
-	Added `array histogramText($text)`, `integer countWordsText($text)` and
	`integer countPolysyllablesText($text)` methods.
-	Refactored cache interface.
-	Improved unit tests.

1.2
-	Deprecated the threshold feature, which was based on a misinterpretation of
	the algorithm. Methods, constants and constructor signature are unchanged,
	although you can now omit the threshold; if you leave it in, it is detected
	as a "fake" threshold.