Rightmove data parsing php class

Rightmove.co.uk lets its agents bulk-upload properties to the site. An agent creates a BLM file, attaches the media files, compresses everything into a zip file and uploads it to an FTP folder specified by Rightmove. The Rightmove format is now a standard for realty sites around Europe. I’ve built a PHP class that parses Rightmove data files, moves the attached media to folders and returns the processed data as an array.
A few important notes:

1. This class works with zip files, so you have to supply a zip file; if you can’t, you will need to modify the class to suit.
2. The BLM file must be well formed. Its header must define the “field value separator”, “line separator”, “total properties” and so on.
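
To make the second note concrete, here is a minimal sketch of what such a header looks like and how its values can be read. The sample content is made up, the separators `^` and `~` are only assumed typical values, and `readBlmHeader()` is a hypothetical helper that is not part of the class below.

```php
<?php
// Illustrative BLM content; field names and values are made-up sample data.
$sampleBlm = "#HEADER#\n"
    . "Version : 3\n"
    . "EOF : '^'\n"
    . "EOR : '~'\n"
    . "Property Count : 1\n"
    . "#DEFINITION#\n"
    . "AGENT_REF^ADDRESS_1^PRICE^~\n"
    . "#DATA#\n"
    . "61_0001^10 High Street^250000^~\n"
    . "#END#\n";

// Hypothetical helper (not part of the class): read the separators and the
// property count from the header section.
function readBlmHeader($blm)
{
    $start = strpos($blm, '#HEADER#');
    $end   = strpos($blm, '#DEFINITION#');
    if ($start === false || $end === false) {
        return null; // header section missing
    }
    $start += strlen('#HEADER#');
    $header = array('field_separator' => null, 'line_separator' => null, 'total' => 0);

    foreach (explode("\n", substr($blm, $start, $end - $start)) as $line) {
        // Each header line looks like "Name : value"; the value may be quoted.
        $parts = explode(':', $line, 2);
        if (count($parts) < 2) {
            continue; // blank line or no "Name : value" pair
        }
        $name  = trim($parts[0]);
        $value = trim(trim($parts[1]), "'");
        if ($name === 'EOF') {
            $header['field_separator'] = $value;
        } elseif ($name === 'EOR') {
            $header['line_separator'] = $value;
        } elseif ($name === 'Property Count') {
            $header['total'] = (int) $value;
        }
    }
    return $header;
}

$header = readBlmHeader($sampleBlm);
print_r($header);
```

If the header is missing or the separators cannot be read, the records that follow cannot be split reliably, which is why the header has to be defined properly.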

Here is the data parsing class file:

<?php
	//rightmove.class.php

	class RightmoveParser{

		public $folder;  //folder to scan
		public $rmfile;	 //source zip file
		public $temp_loc; //temporary file location
		public $image_loc; //property images file location
		public $archive_loc; //location to archive
		public $keep_source_file = true;
		public $validate_fields = false;
		public $validate_values = false;
		public $scan_folder = true;

		//property value with status
		public $STATUS_ID = array(	"Available",
									"SSTC (Sales only)",
									"SSTCM(Scottish Sales only)",
									"Under Offer (Sales only)",
									"Reserved (Lettings only)",
									"Let Agreed (Lettings only)"
								);
		public $PRICE_QUALIFIER		= array(
											0 => "Default",
											1 => "POA",
											2 => "Guide Price",
											3 => "Fixed Price",
											4 => "Offers in Excess of",
											5 => "OIRO",
											6 => "Sale by Tender",
											7 => "From",
											9 => "Shared Ownership",
											10 => "Offers Over",
											11 => "Part Buy Part Rent",
											12 => "Shared Equity"
										);
		public $PUBLISHED_FLAG		= array(0 => "Hidden/invisible", 1 => "Visible");
		public $LET_TYPE_ID			= array(0=>"Not Specified", 1=>"Long Term", 2=>"Short Term", 3=>"Student", 4=>"Commercial");
		public $LET_FURN_ID			= array(0 => "Furnished", 1 => "Part Furnished", 2 => "Unfurnished", 3 => "Not Specified", 4=>"Furnished/Un Furnished");
		public $LET_RENT_FREQUENCY	= array(0 => "Weekly", 1 => "Monthly", 2 => "Quarterly", 3 => "Annual");
		public $TENURE_TYPE_ID		= array(1 => "Freehold", 2 => "Leasehold", 3 => "Feudal", 4 => "Commonhold", 5 => "Share of Freehold");
		public $TRANS_TYPE_ID		= array(1 => "Resale", 2=> "Lettings");
		public $NEW_HOME_FLAG		= array("Y" => "New Home", "N" => "Non New Home");
		public $PROP_SUB_ID			= array(
											0=>"Not Specified",
											1=>"Terraced",
											2=>"End of Terrace",
											3=>"Semi-Detached",
											4=>"Detached",
											5=>"Mews",
											6=>"Cluster House",
											7=>"Ground Flat",
											8=>"Flat",
											9=>"Studio",
											10=>"Ground Maisonette",
											11=>"Maisonette",
											12=>"Bungalow",
											13=>"Terraced Bungalow",
											14=>"Semi-Detached Bungalow",
											15=>"Detached Bungalow",
											16=>"Mobile Home",
											17=>"Hotel",
											18=>"Guest House",
											19=>"Commercial Property",
											20=>"Land",
											21=>"Link Detached House",
											22=>"Town House",
											23=>"Cottage",
											24=>"Chalet",
											27=>"Villa",
											28=>"Apartment",
											29=>"Penthouse",
											30=>"Finca",
											43=>"Barn Conversion",
											44=>"Serviced Apartments",
											45=>"Parking",
											46=>"Sheltered Housing",
											47=>"Retirement Property",
											48=>"House Share",
											49=>"Flat Share",
											50=>"Park Home",
											51=>"Garages",
											52=>"Farm House",
											53=>"Equestrian",
											56=>"Duplex",
											59=>"Triplex",
											62=>"Longere",
											65=>"Gite",
											68=>"Barn",
											71=>"Trulli",
											74=>"Mill",
											77=>"Ruins",
											80=>"Restaurant",
											83=>"Cafe",
											86=>"Mill",
											89=>"Trulli",
											92=>"Castle",
											95=>"Village House",
											101=>"Cave House",
											104=>"Cortijo",
											107=>"Farm Land",
											110=>"Plot",
											113=>"Country House",
											116=>"Stone House",
											117=>"Caravan",
											118=>"Lodge",
											119=>"Log Cabin",
											120=>"Manor House",
											121=>"Stately Home",
											125=>"Off-Plan",
											128=>"Semi-detached Villa",
											131=>"Detached Villa",
											134=>"Bar",
											137=>"Shop",
											140=>"Riad",
											141=>"House Boat",
											142=>"Hotel Room",
											);

		private $document_files = array();	//document files(.blm file) inside zip folder
		private $media_files = array();		//media files such as jpg, gif, pd
		private $doc_header = array('line_separator'=>'EOR', 'field_separator'=>'EOF', 'total'=>0);
		private $properties = array();		//property data
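		private $pe_properties = array();	//property data plus text labels for coded fields
		private $doc_definition = array();	//field names from the #DEFINITION# section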


		/* Note:

			MEDIA_DOCUMENT_50, MEDIA_DOCUMENT_TEXT_50  - MEDIA_DOCUMENT_59, MEDIA_DOCUMENT_TEXT_59 are HIP/EPC values
			MEDIA_IMAGE_60, MEDIA_IMAGE_TEXT_60	- MEDIA_IMAGE_61 & MEDIA_IMAGE_TEXT_61 for EPC Graph
		*/

		//parse and return data
		public function getPropertyData(){

				try{

					if($this->scan_folder && !isset($this->rmfile)){

						//scan the folder first
						$newData = $this->getZipFile();

						//If no new data, terminate
						if(!$newData)
						{
							throw new Exception('<h3>No zip file inside folder</h3>');
						}

					}	

					$this->unzipFiles();	//unzip files
					$this->parseDocs();		//now parse the document files
					$this->removeTempFiles();	//remove temporary files
					$this->archiveZipFile();	//archive the processed file

					if(count($this->pe_properties)<1)
						throw new Exception('<h3>No Data Found</h3>');

					return $this->pe_properties;

				}catch(Exception $e){
					echo $e->getMessage();
				}
		}

		//scan a folder for agent's zip file
		private function getZipFile(){
			try{
				if ($handle = opendir($this->folder)) {

				    /* This is the correct way to loop over the directory. */
				    while (false !== ($file = readdir($handle))) {
				    	
						if ($file != "." && $file != ".." && substr($file,0,1)!='.' ) {

							if(strtolower(substr($file,-4))=='.zip'){	//match .zip and .ZIP, at the end of the name only
					        	$this->rmfile = $file;
								break;		//get only one zip file per folder
							}

						}
				    }
				    closedir($handle);
					return isset($this->rmfile);	//true only if a zip file was found
				}else return false;
			}catch(Exception $e){
				echo $e->getMessage();
			}

		}

		//unzip file and copy to related
		private function unzipFiles(){
			
			try{
			
				//unzip and move file to temporary locations
				$zip = new zipfile;
	
				$data = $zip->read_zip($this->folder.$this->rmfile);
				
				if(empty($data) || !isset($data[0]['data']) || $data[0]['data']=='')
					throw new Exception('Failed to read zip file');

				
				$docidx = 0;
				$mediaidx = 0;
	
				foreach($data as $idx=>$fileinfo){
						//copy files to temporary folder
					    $handle = fopen($this->temp_loc.$fileinfo['name'], 'w');
					    fwrite($handle, $fileinfo['data']);
					    fclose($handle);
	
						//check if it is a document file or other media file
	//					if (fnmatch("*.blm", $fileinfo['name'])) {
						if (strtolower(substr($fileinfo['name'], -4)) == '.blm') {
							$this->document_files[$docidx]['branchid'] = basename($this->rmfile, ".zip");
							$this->document_files[$docidx]['file_path']= $this->temp_loc.$fileinfo['name'];
							$this->document_files[$docidx++]['basename']= $fileinfo['name'];
						}else{
							$this->media_files[$mediaidx]['file_path']= $this->temp_loc.$fileinfo['name'];
							$this->media_files[$mediaidx++]['basename']= $fileinfo['name'];
						}
	
				}
				
				}catch(Exception $e){
				 echo $e->getMessage();
				 exit(1);
				}

		//	print_r($this->document_files);
		}

		//Now parse the blm file
		private function parseDocs(){
			try {
				foreach($this->document_files as $doc){
				
					//read the document file
					$fp = fopen($doc['file_path'],'r');
					$data = fread($fp, filesize($doc['file_path']));
					fclose($fp);
					

					//split the content to header, definition, data
					$header_start = strpos($data,'#HEADER#')+8;
					$header = substr($data,$header_start,strpos($data,'#DEFINITION#')-$header_start);	//document header

					//process header data
					$header_data = explode("\n",$header);
					$header_data = array_filter($header_data,array($this,"cleanArray"));
					
					

					foreach($header_data as $hdata){

						//field value separator
						if(strstr($hdata,"EOF")){
							$replace_chars = array("EOF"," ",":","'","\n","\r");
							$this->doc_header['field_separator'] = str_replace($replace_chars,"",$hdata);
						}

						//line separator
						if(strstr($hdata,"EOR")){
							$replace_chars = array("EOR"," ",":","'","\n","\r");
							$this->doc_header['line_separator'] = str_replace($replace_chars,"",$hdata);
						}

						//total properties
						if(strstr($hdata,"Property Count")){
							$replace_chars = array("Property Count"," ",":","\n","\r");
							$this->doc_header['total'] = (int) str_replace($replace_chars,"",$hdata);
						}
					}
					//end of processing header data
/*	!bookmark */
					//process definition
					$definition_length = strpos($data, $this->doc_header['line_separator'], strpos($data,'#DEFINITION#') )-strpos($data,'#DEFINITION#')-12;
					$definition = substr($data, strpos($data,'#DEFINITION#')+12, $definition_length);	//field's details
					$definition = trim($definition);
					$this->doc_definition = explode($this->doc_header['field_separator'],$definition);
					$this->doc_definition = array_filter($this->doc_definition,array($this,"cleanArray"));
					//end of processing definition
					/* temp commented
					$this->checkMandatoyFields();	//check if document has mandatory fields
					*/

					$content_length = strpos($data, '#END#' )-strpos($data,'#DATA#')-6;
					$content =  substr($data,strpos($data,'#DATA#')+6, $content_length);	//property records
					$content_data = explode($this->doc_header['line_separator'],$content);
					$content_data = array_filter($content_data,array($this,"cleanArray"));

					array_walk($content_data, array($this, 'trimArray'));	//trim the lines			

					//if total properties and number of properties defined in data is not same, throw error
					if((count($content_data)!=$this->doc_header['total']) && $this->doc_header['total']>0 ){
						throw new Exception('<p>Total number of properties in header and total properties defined in data is not same.<br />Defined header total# '.$this->doc_header['total'].'<br />Number of properties in dataset# '.count($content_data).'</p>');
					}else{
						$this->doc_header['total'] = count($content_data);
					}
//					print_r($this->doc_definition);

					foreach($content_data as $key1=>$property_data){
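						//exclude the trailing field separator, whatever its length
						$separator_length = strlen($this->doc_header['field_separator']);
						if(substr($property_data,-$separator_length)===$this->doc_header['field_separator'])
							$property_data = substr($property_data,0,-$separator_length);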
						$raw_data = explode($this->doc_header['field_separator'],$property_data);

						//if total fields defined in definition and fields defined in data is not same, throw error
						if(count($this->doc_definition)!==count($raw_data)){
							throw new Exception('<p>Total fields defined in definition and fields defined in data is not same</p>');
						}

						foreach($raw_data as $key2=>$property){

							$this->checkMandatoyValues($key1, $this->doc_definition[$key2],$property);	//check value for mandatory value fields, throw exception else

							//escape output
							if($property!='')
								$property = $this->formatValue($this->doc_definition[$key2], $property);

							//copy the media proper location
							if(preg_match("/MEDIA_IMAGE_[0-9]{2}/",$this->doc_definition[$key2])){
								if($property!=''){
									if(file_exists($this->temp_loc.$property)){

										$path_parts = pathinfo($this->temp_loc.$property);
										if(is_numeric(array_search(strtolower($path_parts['extension']),array('jpg','gif','pd'))))
								 			@copy($this->temp_loc.$property,$this->image_loc.$property);
										else
										throw new Exception("Unsupported file ".$property);

									}
									/*
									else
										throw new Exception("File '".$property."' doesn't exist in zip.");*/

								}
							}	

							$this->properties[$key1][$this->doc_definition[$key2]] = $property;
							$this->pe_properties[$key1][$this->doc_definition[$key2]] = $property;

							//add text labels for coded fields, using the lookup tables from the manual
							$lookup_fields = array("STATUS_ID","PRICE_QUALIFIER","PUBLISHED_FLAG","LET_TYPE_ID","LET_FURN_ID","LET_RENT_FREQUENCY","TENURE_TYPE_ID","TRANS_TYPE_ID","NEW_HOME_FLAG");
							foreach($lookup_fields as $lookup){
								//skip empty values and codes missing from the lookup table
								if(strstr($this->doc_definition[$key2],$lookup) && (string)$property!=='' && isset($this->{$lookup}[$property])){
									$this->pe_properties[$key1][$lookup.'_TEXT'] = $this->{$lookup}[$property];
								}
							}
							//end of adding text labels

						}

					}
//					print_r($this->pe_properties);

				}
		  	}catch(Exception $e){
				echo $e->getMessage();
			}
		}

		//function to archive a file, remove the source file too
		private function archiveZipFile(){
			@unlink($this->archive_loc.$this->rmfile);		//remove if there is already a same named file
			@copy($this->folder.$this->rmfile, $this->archive_loc.$this->rmfile);

			//delete the source file if needed
			if(!$this->keep_source_file){
				@unlink($this->folder.$this->rmfile);
			}

		}

		//remove files from temporary folder
		private function removeTempFiles(){
			//remove the media files
			foreach($this->media_files as $media_file){
				@unlink($media_file['file_path']);
			}
			//remove the document files
			foreach($this->document_files as $document_file){
				@unlink($document_file['file_path']);
			}	

		}

		//check if document contains mandatory fields
		private function checkMandatoyFields(){
			//if we need validation
			if($this->validate_fields){

				$fields = array("AGENT_REF","ADDRESS_1","ADDRESS_2","TOWN","POSTCODE1","POSTCODE2","FEATURE1","FEATURE2","FEATURE3","FEATURE4","FEATURE5","SUMMARY","DESCRIPTION","BRANCH_ID","STATUS_ID","BEDROOMS","PRICE","PRICE_QUALIFIER","PROP_SUB_ID","CREATE_DATE","UPDATE_DATE","DISPLAY_ADDRESS","PUBLISHED_FLAG","LET_DATE_AVAILABLE","LET_BOND","LET_TYPE_ID","LET_FURN_ID","LET_RENT_FREQUENCY","TENURE_TYPE_ID","TRANS_TYPE_ID","NEW_HOME_FLAG", "MEDIA_IMAGE_00", "MEDIA_IMAGE_60","MEDIA_IMAGE_TEXT_60","MEDIA_DOCUMENT_50","MEDIA_DOCUMENT_TEXT_50");
				try{
					$absent_fields = array_diff($fields,$this->doc_definition);
	//				print_r($absent_fields);
					if(count($absent_fields)>0){
						$msg = "<h3>Your document is missing these fields - ".implode(", ",$absent_fields)."</h3>";
						throw new Exception($msg);
					}

				}catch(Exception $e){
					echo $e->getMessage();
				}
			}
		}

		//check mandatory fields
		private function checkMandatoyValues($line,$field,$value){
			//if we need validation
			if($this->validate_values){

				$field_for_values = array("AGENT_REF","ADDRESS_1","ADDRESS_2","TOWN","POSTCODE1","POSTCODE2","FEATURE1","FEATURE2","FEATURE3","SUMMARY","DESCRIPTION","BRANCH_ID","STATUS_ID","BEDROOMS","PRICE","PROP_SUB_ID","DISPLAY_ADDRESS","PUBLISHED_FLAG","TRANS_TYPE_ID", "MEDIA_IMAGE_00");
				try{
					if(is_numeric(array_search($field,$field_for_values)) && $value==''){
						$msg = "<h3> Line #".$line." - ".$field." can not be empty</h3>\n";
						throw new Exception($msg);
					}
				}catch(Exception $e){
					echo $e->getMessage();
				}
			}		

		}

		//filter array, omit empty values
		private function cleanArray($var){
			$replace_chars = array(" ","\n","\r","\t");
			$var = trim(str_replace($replace_chars,"",$var));
			return (isset($var) && $var!='');
		}

		//trim an array
		private function trimArray(&$var){
			$var = trim($var);
		}

		//escape output
		private function formatValue($field, $value){
			$string_fields = array("AGENT_REF","ADDRESS_1","ADDRESS_2","TOWN","POSTCODE1","POSTCODE2","FEATURE1","FEATURE2","FEATURE3","SUMMARY","DESCRIPTION","DISPLAY_ADDRESS","NEW_HOME_FLAG", "MEDIA_IMAGE_TEXT_00");
			$number_fields = array("BRANCH_ID","STATUS_ID","BEDROOMS","PRICE","PRICE_QUALIFIER","PROP_SUB_ID","PUBLISHED_FLAG","LET_BOND","LET_TYPE_ID","LET_FURN_ID","LET_RENT_FREQUENCY","TENURE_TYPE_ID","TRANS_TYPE_ID");	

			if(is_numeric(array_search($field,$string_fields)) && $value!==''){
				$value = (string) $value;
				$value = strip_tags($value, '<p><u><strong><i><b>');
			}

			if(is_numeric(array_search($field,$number_fields)) && $value!==''){

				if(is_numeric(array_search($field,array("LET_BOND","PRICE") )))
					$value = (float) $value;
				else
					$value = (int) $value;
			}

			return $value;
		}		

	}

?>
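
Several readers asked how to read a .blm file directly, without a zip. The heart of parseDocs() is splitting the #DEFINITION# and #DATA# sections on the two separators, and that step can be tried on its own. The sketch below is a simplified illustration rather than a replacement for the class: `blmToRows()` is a hypothetical helper, the separators are passed in (defaulting to the assumed `^` and `~`) instead of being read from the header, and the sample data is made up.

```php
<?php
// Hypothetical standalone helper; it mirrors the idea of parseDocs() but is
// not part of the class.
function blmToRows($blm, $fieldSep = '^', $lineSep = '~')
{
    // Return the text between two section markers, or '' if either is missing.
    $between = function ($from, $to) use ($blm) {
        $start = strpos($blm, $from);
        if ($start === false) {
            return '';
        }
        $start += strlen($from);
        $end = strpos($blm, $to, $start);
        return $end === false ? '' : substr($blm, $start, $end - $start);
    };

    // Split one record into fields, ignoring the separator that ends it.
    $split = function ($record) use ($fieldSep) {
        $record = trim($record);
        if ($record === '') {
            return array();
        }
        if (substr($record, -strlen($fieldSep)) === $fieldSep) {
            $record = substr($record, 0, -strlen($fieldSep));
        }
        return explode($fieldSep, $record);
    };

    // The definition is a single record listing the field names.
    $definitionRecords = explode($lineSep, $between('#DEFINITION#', '#DATA#'));
    $fields = $split($definitionRecords[0]);
    if (!$fields) {
        return array(); // nothing to map the values onto
    }

    $rows = array();
    foreach (explode($lineSep, $between('#DATA#', '#END#')) as $record) {
        $values = $split($record);
        if (count($values) !== count($fields)) {
            continue; // skip blank or malformed records
        }
        $rows[] = array_combine($fields, $values);
    }
    return $rows;
}

$blm = "#HEADER#\nEOF : '^'\nEOR : '~'\n#DEFINITION#\nAGENT_REF^TOWN^PRICE^~\n"
     . "#DATA#\nA1^Leeds^100000^~\nA2^York^^~\n#END#";
$rows = blmToRows($blm);
print_r($rows);
```

Unlike the class, this sketch silently skips records whose field count does not match the definition instead of throwing an exception, and it does not copy any media files.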

Here is the zip class that I used to process the compressed file:

<?php
/**************************************************************************************************
 *  File Definition 
 *  - Zip compression
 -------------------------------------------------------------------
    ABOUT THIS  3RD PARTY LIB.
 --------------------------------------------------------------------
 *  Downloaded from : www.phpclasses.org
 *  http://www.phpclasses.org/browse/file/3631.html 
 -------------------------------------------------------------------  
 *  Run on PHP versions 4 and 5
 -------------------------------------------------------------------
 *  Apprain : Content Management Framework <http://www.apprain.com/>
 *  Download link: http://www.apprain.com/download
 *  Docs link: http://www.apprain.com/docs
 -------------------------------------------------------------------
 *  License text http://www.opensource.org/licenses/mit-license.php 
 *  About MIT license <http://en.wikipedia.org/wiki/MIT_License/>
*************************************************************************************************/
class zipfile
{
    /**
     * Array to store compressed data
     *
     * @var  array    $datasec
     */
    var $datasec      = array();

    /**
     * Central directory
     *
     * @var  array    $ctrl_dir
     */
    var $ctrl_dir     = array();

    /**
     * End of central directory record
     *
     * @var  string   $eof_ctrl_dir
     */
    var $eof_ctrl_dir = "\x50\x4b\x05\x06\x00\x00\x00\x00";

    /**
     * Last offset position
     *
     * @var  integer  $old_offset
     */
    var $old_offset   = 0;


    /**
     * Converts an Unix timestamp to a four byte DOS date and time format (date
     * in high two bytes, time in low two bytes allowing magnitude comparison).
     *
     * @param  integer  the current Unix timestamp
     *
     * @return integer  the current date in a four byte DOS format
     *
     * @access private
     */
    function unix2DosTime($unixtime = 0) {
        $timearray = ($unixtime == 0) ? getdate() : getdate($unixtime);

        if ($timearray['year'] < 1980) {
        	$timearray['year']    = 1980;
        	$timearray['mon']     = 1;
        	$timearray['mday']    = 1;
        	$timearray['hours']   = 0;
        	$timearray['minutes'] = 0;
        	$timearray['seconds'] = 0;
        } // end if

        return (($timearray['year'] - 1980) << 25) | ($timearray['mon'] << 21) | ($timearray['mday'] << 16) |
                ($timearray['hours'] << 11) | ($timearray['minutes'] << 5) | ($timearray['seconds'] >> 1);
    } // end of the 'unix2DosTime()' method


    /**
     * Adds "file" to archive
     *
     * @param  string   file contents
     * @param  string   name of the file in the archive (may contains the path)
     * @param  integer  the current timestamp
     *
     * @access public
     */
    function addFile($data, $name, $time = 0)
    {
        $name     = str_replace('\\', '/', $name);

        $dtime    = dechex($this->unix2DosTime($time));
        $hexdtime = '\x' . $dtime[6] . $dtime[7]
                  . '\x' . $dtime[4] . $dtime[5]
                  . '\x' . $dtime[2] . $dtime[3]
                  . '\x' . $dtime[0] . $dtime[1];
        eval('$hexdtime = "' . $hexdtime . '";');

        $fr   = "\x50\x4b\x03\x04";
        $fr   .= "\x14\x00";            // ver needed to extract
        $fr   .= "\x00\x00";            // gen purpose bit flag
        $fr   .= "\x08\x00";            // compression method
        $fr   .= $hexdtime;             // last mod time and date

        // "local file header" segment
        $unc_len = strlen($data);
        $crc     = crc32($data);
        $zdata   = gzcompress($data);
        $zdata   = substr(substr($zdata, 0, strlen($zdata) - 4), 2); // fix crc bug
        $c_len   = strlen($zdata);
        $fr      .= pack('V', $crc);             // crc32
        $fr      .= pack('V', $c_len);           // compressed filesize
        $fr      .= pack('V', $unc_len);         // uncompressed filesize
        $fr      .= pack('v', strlen($name));    // length of filename
        $fr      .= pack('v', 0);                // extra field length
        $fr      .= $name;

        // "file data" segment
        $fr .= $zdata;

        // "data descriptor" segment (optional but necessary if archive is not
        // served as file)
        $fr .= pack('V', $crc);                 // crc32
        $fr .= pack('V', $c_len);               // compressed filesize
        $fr .= pack('V', $unc_len);             // uncompressed filesize

        // add this entry to array
        $this -> datasec[] = $fr;

        // now add to central directory record
        $cdrec = "\x50\x4b\x01\x02";
        $cdrec .= "\x00\x00";                // version made by
        $cdrec .= "\x14\x00";                // version needed to extract
        $cdrec .= "\x00\x00";                // gen purpose bit flag
        $cdrec .= "\x08\x00";                // compression method
        $cdrec .= $hexdtime;                 // last mod time & date
        $cdrec .= pack('V', $crc);           // crc32
        $cdrec .= pack('V', $c_len);         // compressed filesize
        $cdrec .= pack('V', $unc_len);       // uncompressed filesize
        $cdrec .= pack('v', strlen($name) ); // length of filename
        $cdrec .= pack('v', 0 );             // extra field length
        $cdrec .= pack('v', 0 );             // file comment length
        $cdrec .= pack('v', 0 );             // disk number start
        $cdrec .= pack('v', 0 );             // internal file attributes
        $cdrec .= pack('V', 32 );            // external file attributes - 'archive' bit set

        $cdrec .= pack('V', $this -> old_offset ); // relative offset of local header
        $this -> old_offset += strlen($fr);

        $cdrec .= $name;

        // optional extra field, file comment goes here
        // save to central directory
        $this -> ctrl_dir[] = $cdrec;
    } // end of the 'addFile()' method


    /**
     * Dumps out file
     *
     * @return  string  the zipped file
     *
     * @access public
     */
    function file()
    {
        $data    = implode('', $this -> datasec);
        $ctrldir = implode('', $this -> ctrl_dir);

        return
            $data .
            $ctrldir .
            $this -> eof_ctrl_dir .
            pack('v', sizeof($this -> ctrl_dir)) .  // total # of entries "on this disk"
            pack('v', sizeof($this -> ctrl_dir)) .  // total # of entries overall
            pack('V', strlen($ctrldir)) .           // size of central dir
            pack('V', strlen($data)) .              // offset to start of central dir
            "\x00\x00";                             // .zip file comment length
    } // end of the 'file()' method
    

    /**
     * A Wrapper of original addFile Function
     *
     *
     * @param array An Array of files with relative/absolute path to be added in Zip File
     *
     * @access public
     */
    function addFiles($files /*Only Pass Array*/)
    {
        foreach($files as $file)
        {
			if (is_file($file)) //directory check
			{
				$data = implode("",file($file));
				$this->addFile($data,$file);
			}
			
        }
    }
    
    /**
     * A Wrapper of original file Function
     *
     *
     * @param string Output file name
     *
     * @access public
     */
    function output($file = NULL)
    {
		if( isset($file))
		{
			$fp=fopen($file,"w");
			fwrite($fp,$this->file());
			fclose($fp);
		}
		else
		{
			header('Content-type: application/zip');
			header('Content-Disposition: attachment; filename="downloaded.zip"');
			echo $this->file();
		}
    }

    function read_zip($name)
	{
		// Clear current file and any entries from a previous read
		$this->datasec = array();
		$this->files = array();

		// File information
		$this->name = $name;
		$this->mtime = filemtime($name);
		$this->size = filesize($name);

		// Read file
		$fh = fopen($name, "rb");
		$filedata = fread($fh, $this->size);
		fclose($fh);

		// Break into sections
		$filesecta = explode("\x50\x4b\x05\x06", $filedata);

		// ZIP Comment
		$unpackeda = unpack('x16/v1length', $filesecta[1]);
		$this->comment = substr($filesecta[1], 18, $unpackeda['length']);
		$this->comment = str_replace(array("\r\n", "\r"), "\n", $this->comment); // CR + LF and CR -> LF

		// Cut entries from the central directory
		$filesecta = explode("\x50\x4b\x01\x02", $filedata);
		$filesecta = explode("\x50\x4b\x03\x04", $filesecta[0]);
		array_shift($filesecta); // Removes empty entry/signature

		foreach($filesecta as $filedata)
		{
			// CRC:crc, FD:file date, FT: file time, CM: compression method, GPF: general purpose flag, VN: version needed, CS: compressed size, UCS: uncompressed size, FNL: filename length
			$entrya = array();
			$entrya['error'] = "";

			$unpackeda = unpack("v1version/v1general_purpose/v1compress_method/v1file_time/v1file_date/V1crc/V1size_compressed/V1size_uncompressed/v1filename_length", $filedata);

			// Check for encryption
			$isencrypted = (($unpackeda['general_purpose'] & 0x0001) ? true : false);

			// Check for value block after compressed data
			if($unpackeda['general_purpose'] & 0x0008)
			{
				$unpackeda2 = unpack("V1crc/V1size_compressed/V1size_uncompressed", substr($filedata, -12));

				$unpackeda['crc'] = $unpackeda2['crc'];
				$unpackeda['size_compressed'] = $unpackeda2['size_compressed'];
				$unpackeda['size_uncompressed'] = $unpackeda2['size_uncompressed'];

				unset($unpackeda2);
			}

			$entrya['name'] = substr($filedata, 26, $unpackeda['filename_length']);

			if(substr($entrya['name'], -1) == "/") // skip directories
			{
				continue;
			}

			$entrya['dir'] = dirname($entrya['name']);
			$entrya['dir'] = ($entrya['dir'] == "." ? "" : $entrya['dir']);
			$entrya['name'] = basename($entrya['name']);


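			// Skip the fixed header, the file name and the optional extra field
			$extraa = unpack("v1extra_length", substr($filedata, 24, 2));
			$filedata = substr($filedata, 26 + $unpackeda['filename_length'] + $extraa['extra_length']);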

			if(strlen($filedata) != $unpackeda['size_compressed'])
			{
				$entrya['error'] = "Compressed size is not equal to the value given in header.";
			}

			if($isencrypted)
			{
				$entrya['error'] = "Encryption is not supported.";
			}
			else
			{
				switch($unpackeda['compress_method'])
				{
					case 0: // Stored
						// Not compressed, continue
					break;
					case 8: // Deflated
						$filedata = gzinflate($filedata);
					break;
					case 12: // BZIP2
						if(!extension_loaded("bz2"))
						{
							@dl((strtolower(substr(PHP_OS, 0, 3)) == "win") ? "php_bz2.dll" : "bz2.so");
						}

						if(extension_loaded("bz2"))
						{
							$filedata = bzdecompress($filedata);
						}
						else
						{
							$entrya['error'] = "Required BZIP2 Extension not available.";
						}
					break;
					default:
						$entrya['error'] = "Compression method ({$unpackeda['compress_method']}) not supported.";
				}

				if(!$entrya['error'])
				{
					if($filedata === false)
					{
						$entrya['error'] = "Decompression failed.";
					}
					elseif(strlen($filedata) != $unpackeda['size_uncompressed'])
					{
						$entrya['error'] = "File size is not equal to the value given in header.";
					}
					elseif(crc32($filedata) != $unpackeda['crc'])
					{
						$entrya['error'] = "CRC32 checksum is not equal to the value given in header.";
					}
				}

				$entrya['filemtime'] = mktime(($unpackeda['file_time']  & 0xf800) >> 11,($unpackeda['file_time']  & 0x07e0) >>  5, ($unpackeda['file_time']  & 0x001f) <<  1, ($unpackeda['file_date']  & 0x01e0) >>  5, ($unpackeda['file_date']  & 0x001f), (($unpackeda['file_date'] & 0xfe00) >>  9) + 1980);
				$entrya['data'] = $filedata;
			}

			$this->files[] = $entrya;
		}

		return $this->files;
	}
} // end of the 'zipfile' class
?>
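
If the zip class above gives you decompression errors, one alternative worth trying is PHP's own ZipArchive class, which ships with the zip extension. This is a swapped-in technique, not what the parser uses: `readZipEntries()` is a hypothetical helper that returns entries in the same `name`/`data` shape that unzipFiles() expects, and the sketch assumes the zip extension is enabled.

```php
<?php
// Hypothetical helper: returns an array of ['name' => ..., 'data' => ...]
// entries, read with the zip extension's ZipArchive class.
function readZipEntries($zipPath)
{
    $entries = array();
    $zip = new ZipArchive();
    if ($zip->open($zipPath) !== true) {
        return $entries; // could not open the archive
    }
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $name = $zip->getNameIndex($i);
        if ($name === false || substr($name, -1) === '/') {
            continue; // skip directory entries
        }
        $data = $zip->getFromIndex($i);
        if ($data === false) {
            continue; // entry could not be extracted
        }
        // Keep only the file name, as read_zip() does.
        $entries[] = array('name' => basename($name), 'data' => $data);
    }
    $zip->close();
    return $entries;
}

// Usage: build a tiny archive in the temp directory, then read it back.
$path = sys_get_temp_dir() . '/rm_demo_' . uniqid() . '.zip';
$zip = new ZipArchive();
$zip->open($path, ZipArchive::CREATE | ZipArchive::OVERWRITE);
$zip->addFromString('61.blm', "#HEADER#\n#END#");
$zip->addFromString('images/61_0001_IMG_00.jpg', 'fake image bytes');
$zip->close();

$entries = readZipEntries($path);
print_r($entries);
unlink($path);
```

To try it with the parser you would replace the `read_zip()` call in unzipFiles() with a call to a helper like this one; the rest of the method works on the returned array unchanged.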

Here is an example of how to use this class:

<?php

 require_once 'rightmove.class.php';
 require_once 'zip.class.php';

 $rmparser = new RightmoveParser();
 $rmparser->folder = "./files/";	//foldername to scan (foldername with trailing slash)
 $rmparser->temp_loc = "./tmp/";	//temporary file location	(path with trailing slash)
 $rmparser->image_loc = "./images/";	//image location to copy images		(path with trailing slash)
 $rmparser->archive_loc = "./archives/";	//archive location (path with trailing slash)
 $rmparser->keep_source_file = true;		//true keeps the source zip after processing, false deletes it

 /* to handle zip file directly
 $rmparser->scan_folder=false;
 */

 $rmparser->rmfile = "61.zip";	//zip file name inside $rmparser->folder; leave unset to scan the folder


 $rmdata = $rmparser->getPropertyData();
 print_r($rmdata);

?>
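
The returned value is an array with one entry per property, keyed by the BLM field names, so individual values can be read by name. Here is a small sketch of listing a few fields on a page. The `$rmdata` array is a made-up stand-in for the result of getPropertyData(), and `renderProperties()` is a hypothetical helper.

```php
<?php
// Stand-in for the array returned by getPropertyData(); made-up sample data.
$rmdata = array(
    array(
        'AGENT_REF'       => '61_0001',
        'DISPLAY_ADDRESS' => 'High Street, Leeds',
        'PRICE'           => '250000',
        'STATUS_ID'       => '0',
        'STATUS_ID_TEXT'  => 'Available',
    ),
    array(
        'AGENT_REF'       => '61_0002',
        'DISPLAY_ADDRESS' => 'Smith & Sons Yard, York',
        'PRICE'           => '99500',
        'STATUS_ID'       => '3',
        'STATUS_ID_TEXT'  => 'Under Offer (Sales only)',
    ),
);

// Hypothetical helper: build one HTML list item per property.
function renderProperties(array $properties)
{
    $html = "<ul>\n";
    foreach ($properties as $property) {
        // Fall back to an empty string when a field is absent.
        $ref     = isset($property['AGENT_REF']) ? $property['AGENT_REF'] : '';
        $address = isset($property['DISPLAY_ADDRESS']) ? $property['DISPLAY_ADDRESS'] : '';
        $price   = isset($property['PRICE']) ? number_format((float) $property['PRICE']) : '';
        $status  = isset($property['STATUS_ID_TEXT']) ? $property['STATUS_ID_TEXT'] : '';

        // Escape the values before they go into the page.
        $line = $ref . ': ' . $address . ', ' . $price . ' (' . $status . ')';
        $html .= '<li>' . htmlspecialchars($line) . "</li>\n";
    }
    return $html . "</ul>\n";
}

echo renderProperties($rmdata);
```

The same loop is also a natural place to put database insert code, since each `$property` already holds one record as field name and value pairs.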

Hope this is helpful. Thanks.


42 thoughts on “Rightmove data parsing php class”

  1. Hi Musa. This is excellent work, very useful. But the files are not always a zip file. I have a number of estate agent sites that feed a .blm file to Right Move. I am now working on script to import a .blm file back into the estate agent database. So I think what you have done is excellent but needs the ability to process .blm direct also, not just .zip.

  2. This looks great but I keep getting this error:
    Parse error: syntax error, unexpected T_STRING, expecting T_OLD_FUNCTION or T_FUNCTION or T_VAR or ‘}’ in …. on line 6
    Any ideas?

    • Hi Matt Short,
      I didn’t face such problem, i did testing again with sample data – it’s ok with those, send me your code with data at musa_bd[at]yahoo[dot]com – i can check

    • Does chocolate grow on chocolate trees and if it does would this still work? I need to know as I believe chocolate could affect the results of this. If the keyboard get sticky with chocolate for example.

  3. Fantastic class, thanks.
    My images dont seem to get copied to the right folder though, any known reason behind this?
    Also, is there a version which takes BLM files directly unzipped?

    Thanks.

  4. does any one have a working sample of this that they can make available for download so i can see were i have gone wrong as i still can not get it to work?
    thanks in advance

  5. Hi,
    Thanks for this great parser, it’s coming it really useful!

    Did you ever make those changes to accept a regular blm file instead of a zip file? I’ve tried a few things but only get “No Data Found”.

  6. Hi,
    Thanks for this code – has saved me hours of work.

    I am using these on a cron to import properties into a Drupal built site.
    The BLM file is read correctly and all the ‘ property nodes’ import with the correct data into the property content type BUT I have hit a stumbling block for all the images!
    The error log contains the following message for every image file ‘gzinflate(): data error in /var/www/vhosts/../inc/zip.class.php on line 303.

    All of the images get copied over but they are all ‘zero’ bytes so it seems that the images are not getting extracted from the zip correctly. I have downloaded the zip and manually extracted the images and all are OK so I cant see a problem with the zip?

    Any ideas as to what is causing the above error? All help massively appreciated as I am running out of ideas?

    Thanks – Russ

  7. Great code..thanks!!
    added this

    echo '<pre>';
    print_r($rmdata);
    echo '</pre>';

    Formats the print_r function a bit better..

    Question is now that i have my beautiful output from my blm zip file how can i echo / display the variables from the array individually on a web page?

    May sound silly but im a designer not a developer..

  8. Thanks for all your hard work on this works so much better than mine did.

    I would also like to know if you have made the changes you said about doing the BLM directly,also do have an example to how I would import the data via an xml template to create pages?

  9. The web works in a Request/Response cycle. A request is made from clicking a link or typing a URL string in the browser. A remote computer and file are called. The registration.php file then collects the data from the URL string, and begins processing your request. Once the Request has been processed, a Response is created. This Response is sent back to you in the form of a web page.

    • Hi… I also want to save the Blm file in database bt cant get the point where to insert the Insertion code and how… Can u help me a bit with that..

  10. I understand this was posted a while ago – But could you point me in the direction as to why i get a “No Data Found” Error. I can’t seem to fathom why that would be.

    Thanks,

    Sean

  11. Great class, thanks! To ensure compatibility with uppercase file extensions, it might be a good idea to change line 177 to:

    if(strstr(strtolower($file),".zip")){

    and line 218:

    if (strstr(strtolower($fileinfo['name']), ".blm")) {

  12. Hi… I also want to save the Blm file in database bt cant get the point where to insert the Insertion code and how… Can u help me a bit with that..

  13. This may not be the best place to ask this, but I want to know if I can have a short sale and I don’t know how to find a professional listing realtor… have you ever heard of this realtor? They’re located in sacramento, 20 min from my office and I can’t find reviews on them – Becky Lund & Associates – Sacramento Realtors, 8814 Madison Avenue #2 Fair Oaks, CA 95628 (916) 531-7124

  14. Hi,

    I have used this code but its not working for me. please help me to insert bulk properties import in my website.

    Thanks,
    Manjit Singh
