Proceedings of the 6th ESRI South Asia Users' Conference
Shangri-La Hotel, Penang, Malaysia
8-10 September 1997

Creating Large Digital Maps
for Municipal Planning Applications Using Desktop GIS

- The Penang Experience



Lee Lik Meng, Ph. D.
School of Housing, Building & Planning
Universiti Sains Malaysia













A digital cadastral map for the entire Penang Island has been created using desktop technology, namely, 486 and Pentium PCs, pc Arc/Info and ArcView. The map was created from over 200 standard survey sheets of various scales and presently contains more than 50,000 land parcels covering the entire area of the local planning authority. Using this base, a digital zoning plan was created using the editing tools of ArcView. The digital maps are part of an overall GIS-based planning system being implemented on the local area network of the Town Planning Department, Municipal Council of Penang Island.



As we embrace and pursue the vision of the Multimedia Super Corridor (MSC) with multi-million dollar investments in seven flagship applications including electronic government, we must realise that each individual or organisation responds in its own way to the imminence of the Information Age. Not surprisingly, it is not all plain sailing: IT projects have seen a mixture of successes and failures. Just recently, it was reported in the media that a Perak Government-backed public-private joint venture IT company squandered RM 5 million (In-Tech, 26 Aug 97) with little to show in terms of products or applications. Elsewhere, government departments are spending several hundred thousand and even millions of Ringgit on pilot projects which in turn generate other pilot projects or fail to become operational within the business activities of the government departments. On the other hand, there are also departments which are cautious and prudent in their funding of IT projects, sometimes with adverse results such as the acquisition of under-powered systems unable to cope with or perform the functions required.

In the area of town and country planning, failures outnumber successes. There are various reasons for the generally low success rate, among them: lack of in-house expertise; improper or inadequate advice from vendors; inability to mobilise staff; reluctance to share resources between departments; lack of funds and commitment; improper use of technology resulting in increased workload; and lack of proper needs assessment, user input and documentation (see Lee, et al, 1996).

In the case of the Town Planning Department of the Municipal Council of Penang Island (MPPP), it acquired the pc Arc/Info software, an A0-sized digitiser, a large format printer and a 486 PC with 4 Megabytes of RAM sometime in 1995 and attempted to get its GIS operational. After about a year, the department realised that its in-house expertise was inadequate and approached the USM Team for assistance. A pilot project was initiated in July 1996.

In this follow-up to an earlier paper (Lee, et al, 1996), I am happy to report the successful completion of various components of the pilot project. This paper will, however, only highlight and discuss the problems, issues and resolutions related to the creation of two digital maps, namely, the cadastral map and zoning plan for the entire Island of Penang.

Specifically, this paper highlights the successful completion of data capture relying entirely on desktop technology, with the promise of a cost-effective solution for agencies anxious to keep start-up costs low until such time as they are confident enough to embrace the more powerful and more expensive systems.



The pilot project involves two major components, namely: (a) the development of a computer-based system for the processing of applications for planning permission using a relational database system, and (b) the creation of two GIS-based digital maps (cadastral and zoning) with a simple customised enquiry system using ArcView. An important requirement for the project is that the two components must be capable of being linked together to form the basis for a GIS-based Planning System for the Town Planning Department. Currently, the planning permission system using Microsoft Access 97 has been installed into the computers at the Town Planning Department for testing. It will eventually be linked to the GIS component.

The Department’s computer system comprises 8 new Pentium 166 MHz machines with 32 Megabytes of RAM each, running Windows 95 as the primary network for file sharing. The Department’s network is based on CAT 5 UTP cabling and a 10Base-T hub. It is connected to MPPP’s fibre optic backbone and is therefore ready for an enterprise-wide system.





Cadastral Map

A total of 135 standard map sheets were digitised by the USM Team, comprising:

74 sheets of 1 inch : 4 chain maps (outside of Georgetown)
53 sheets of 1 inch : 80 feet maps (western part of Georgetown), and
8 sheets of 1 inch : 1 chain (Balik Pulau Town).

In addition, a section of Georgetown (comprising about 100 map sheets of 1 inch : 40 feet) had previously been digitised by the Penang State Town and Country Planning Department (JPBD) with the facilities at PEGIS (Penang GIS Centre). To initiate and promote the culture of data-sharing, it was agreed that this data would be integrated into the work carried out by the USM Team. However, because of problems with edgematching and joining the JPBD/PEGIS section with the rest of the Island, the USM Team undertook to redigitise portions of the 16 map sheets (1 inch : 40 feet) on the border of the two sections. The entire digital cadastral map of Penang Island was therefore created from more than 200 map sheets of different scales. A hardcopy of the Cadastral Map of Pulau Pinang generated from ArcView using a colour inkjet printer is included at the end of this paper.
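
As a worked illustration (not part of the original procedure), the imperial sheet scales quoted above can be converted to representative fractions using 1 chain = 66 feet and 1 foot = 12 inches. The function name and the list of scales are assumptions made for this sketch.

```python
# Imperial survey units: 1 chain = 66 feet, 1 foot = 12 inches.
INCHES_PER_FOOT = 12
FEET_PER_CHAIN = 66


def scale_denominator(ground_value, unit):
    """Return N for a '1 inch : ground_value unit' sheet, i.e. a scale of 1:N."""
    if unit == "feet":
        return ground_value * INCHES_PER_FOOT
    if unit == "chains":
        return ground_value * FEET_PER_CHAIN * INCHES_PER_FOOT
    raise ValueError(f"unsupported unit: {unit}")


# Sheet scales mentioned in the text.
SHEET_SCALES = [(40, "feet"), (80, "feet"), (1, "chains"), (4, "chains")]

for value, unit in SHEET_SCALES:
    print(f"1 inch : {value} {unit} -> 1:{scale_denominator(value, unit)}")
```

On this basis the 1 inch : 4 chains sheets are at 1:3,168 while the 1 inch : 40 feet sheets are at 1:480, which gives a sense of how widely the source scales differ.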

The creation of the digital cadastral map of Penang Island involved nine major steps typical of most data capture processes in a GIS using the digitising method (see discussion below). Currently, the digital cadastral map has the parcel geometry and lot numbers, with administrative district codes to provide the link to other (including non-GIS) databases and applications. The database includes only land parcels shown on the standard survey sheets acquired from the Survey Department. Hence, it does not include provisional lots, land in the process of subdivision or land with final title but not yet indicated on the standard sheets.


Zoning Plans

Zoning plans are instruments of the local planning authority to guide and control permissible developments or land use for each parcel of land (or parts thereof). As such, zoning plans are produced on cadastral base maps. In the case of MPPP, cadastral sheets were joined into composites to form seven separate sheets of 1 inch : 32 chains maps covering the whole Island. These map sheets are then manually painted using Ecoline colours. Typically, producing one set of zoning plans requires 3 or 4 technicians working for a period of 3 to 4 weeks. The process is obviously tedious, error-prone and time-consuming.

As part of the pilot project, a digital version of the zoning plans has been created based on the digital cadastral maps created earlier. The USM Team relied entirely on the editing capabilities of ArcView 3.0 to undertake this task. The digital zoning plan is almost complete, and it takes merely an hour or less to print the entire map of the Island with a large format printer. With this digital version, future amendments and updates can be completed in a fraction of the time required by the current manual method. More importantly, the digital zoning map has the same geometric base as the digital cadastral map and can be integrated or linked to non-GIS based planning application systems which will benefit from the spatial analysis and search capabilities of the GIS. Planners and technicians will have on-line access to the zoning plans as they go about processing applications for development approval, thus increasing efficiency.





The Procedure for Creating The Digital Cadastral Map


The procedure for creating the digital cadastral map of Penang Island is shown in Figure 1. It is based on the utilities available in pc Arc/Info version 3.4.2b and is typical of any project which uses the digitising approach for data capture.

The stages are briefly described below (refer to Figure 1), including tips and pitfalls to avoid:


Figure 1
Procedure for Creating Digital Cadastral Map

This is the primary method of capturing the geometric boundaries of the land parcels from standard survey sheets used in this project. An A0-sized digitiser connected to a 486 PC with 16 Megabytes RAM running pc Arc/Info’s ADS was used for the entire project.


A set of instructions was developed and adhered to by the staff. These included instructions to pace the interval between "clicks" of the keys to allow sufficient time for the computer to register and respond to the signals (yes, the system hangs if you click too fast in succession); to digitise arcs and polygons in an anti-clockwise direction; and to progress from the upper-left corner to the lower-right corner of each map. The digitising staff were also trained to recognise the lines which constitute the legal boundary of land parcels since these maps also include numerous other features such as rivers, road pavements, drains, etc. These rules help to ensure systematic coverage of the map and reduce or eliminate omissions of lot boundaries or inclusion of non-cadastral features.

To ensure geometric integrity, each change in direction of an arc (as denoted by the cornerstones of lot boundaries) is digitised as either a node or a vertex. For regular blocks of subdivision such as terrace lots, the outline of the entire block was digitised first and then each common lot boundary was digitised with a deliberate overshoot. When cleaned later, the overshoots would be removed to produce regular rows of subdivided lots. Instructions were also given to close the polygons for roads and access. Each map sheet is also digitised with a boundary box coinciding with the extent of coverage of a standard sheet as defined by the four control points (tics) of each map. Lots on the edges of the maps extending into adjoining maps are also digitised with deliberate overshoots. When cleaned, these partial lots will form polygons with the boundary box. The creation of polygons for these partial lots is necessary to facilitate mapjoin and the merging of lots across map sheets.


This is a fairly straightforward step to convert the digitised map into a GIS-layer, that is, creating topology for the map features. "Clean" is used primarily to remove the overshoots for the regular blocks as well as for the partial lots on the map edges. "Build" is often used after the initial "clean" if there is a need to recreate topology as a result of subsequent editing.



Finally, to complete the process of creating topology, each polygon (lot) must be assigned a label point to which attributes (such as lot numbers) can be added. This is again straightforward with autolabeling (the command in pc Arc/Info is CREATELABELS). In general, the label IDs are not significant for the purposes of creating the digital cadastral layer because the lot numbers are used for merging the partial lots across map sheets. However, there are several points to note. Firstly, it is critical to constantly check that the number of label points is always one less than the number of polygons. This is because every layer must have an external polygon (the universe) which has the label ID of 0 (zero) with no label point. Any other value means that there are polygons without label points or polygons with multiple label points. This value can be checked with the command DESCRIBE at the Arc prompt. On the other hand, even if the number of label points is one less than the number of polygons, it does not necessarily mean that everything is in order. Another point to note about label IDs is that in some operations such as DISSOLVE, new IDs are generated for all polygons. Hence, it is useless to enforce any convention for assigning user-defined label IDs if such commands are likely to be used in subsequent operations. Finally, if the Arc/Info coverages are subsequently converted to shapefiles in ArcView, these label IDs can be removed as they no longer serve any function.
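
The caution about label counts can be illustrated with a small sketch. It assumes a hypothetical mapping from each polygon to the number of label points found inside it (the universe polygon excluded); the function names are invented for this example and are not pc Arc/Info commands.

```python
def counts_look_consistent(num_polygons, num_labels):
    """Rule of thumb from the text: labels should be one fewer than polygons,
    because the external (universe) polygon carries no label point."""
    return num_labels == num_polygons - 1


def label_problems(labels_per_polygon):
    """Return (polygons with no label, polygons with several labels).

    labels_per_polygon is a hypothetical mapping of polygon id -> number of
    label points inside it, with the universe polygon left out.
    """
    missing = sorted(p for p, n in labels_per_polygon.items() if n == 0)
    multiple = sorted(p for p, n in labels_per_polygon.items() if n > 1)
    return missing, multiple


# Three real polygons plus the universe polygon = 4 polygons in total.
labels = {1: 1, 2: 0, 3: 2}
total_labels = sum(labels.values())  # 3 label points in all

print(counts_look_consistent(4, total_labels))  # the totals agree
print(label_problems(labels))                   # yet polygons 2 and 3 are wrong
```

Here the totals satisfy the "one less" rule even though one polygon has no label and another has two, which is why a per-polygon check is still worthwhile.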



When the maps are first digitised, the measurements recorded by the GIS are in digitiser units, that is, in number of inches measured from the origin (0,0) at the lower-left corner of the digitiser. Transformation is required to convert the digitiser units to real-world coordinates (e.g. number of feet or metres from an identified origin, or degrees of longitude and latitude). Once transformation has been carried out, subsequent edits to the digital maps do not require transformation (they only require "clean" or "build" to recreate topology). Since the cadastral maps have already been projected using the Cassini system, there is no need for the maps to be projected again. The entire digital cadastral map for Penang Island is transformed into metric units.
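
One common way to carry out such a transformation is to fit an affine mapping to the tic points by least squares. The sketch below illustrates the idea only; it is not claimed to be how pc Arc/Info implements its transformation, and the tic coordinates, origin and function names are invented for the example.

```python
def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    a = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_affine(tics_digitiser, tics_world):
    """Least-squares fit of X = a*x + b*y + c and Y = d*x + e*y + f."""
    rows = [(x, y, 1.0) for x, y in tics_digitiser]
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    coeffs = []
    for k in (0, 1):  # 0 fits the world X coordinate, 1 fits the world Y coordinate
        rhs = [sum(r[i] * w[k] for r, w in zip(rows, tics_world)) for i in range(3)]
        coeffs.append(solve3(normal, rhs))
    return coeffs


def apply_affine(coeffs, point):
    """Convert one digitiser point (inches) to world coordinates."""
    x, y = point
    (a, b, c), (d, e, f) = coeffs
    return (a * x + b * y + c, d * x + e * y + f)


# Hypothetical tics: digitiser inches and matching world metres for a
# 1 inch : 4 chains sheet (1 chain = 20.1168 m), with an invented origin.
METRES_PER_INCH = 4 * 20.1168
tics_in = [(1.0, 1.0), (1.0, 21.0), (31.0, 21.0), (31.0, 1.0)]
tics_m = [(1000.0 + METRES_PER_INCH * (x - 1.0), 5000.0 + METRES_PER_INCH * (y - 1.0))
          for x, y in tics_in]

coeffs = fit_affine(tics_in, tics_m)
print(apply_affine(coeffs, (16.0, 11.0)))
```

With four tics and six unknowns the fit is over-determined, so the residuals at the tics give a rough indication of manuscript distortion.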


During this stage, each digitised map is clipped to create coverages which fit exactly to the boundary of a standard sheet as defined by the four control (tic) points. This process is necessary to ensure that lots on the map edges form closed polygons in order to facilitate the mapjoin and dissolve operations later to create the seamless map of Penang Island. It is noted that unclosed polygons will not be visible in ArcView and will not be included when the coverages are converted to shapefiles. The maps as they stand now contain the geometric shapes of each land parcel. The next stage adds attributes, i.e. lot numbers.

Additem (Lot Numbers)

Additem is an Arc/Info command to add additional fields (items) to the coverage in order for attributes to be attached to polygons to further describe them. It is also possible to add new fields to Arc/Info coverages using ArcView.

For this project, ArcView’s graphical interface was used to manually key in the lot numbers. By opening the relevant coverage (theme) and its associated tables simultaneously in ArcView, the data-entry personnel select each lot (polygon) on the map (theme) and the relevant record in the associated table is highlighted. They then check the lot number against the paper map manuscript and key the number directly into the table for that record. A 6-digit "string" is used for all lot numbers in conformance with the Survey and Mapping Department’s specifications. Additional zeros are added to the front of lot numbers to pad them to 6 digits. At this point of the data capture process, the administrative district codes are not yet added. Note that lot numbers themselves are not unique (i.e. they are repeated in other administrative areas).
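
The zero-padding rule can be sketched as follows. This is only an illustration of the 6-digit string convention described above; the function name and the length check are assumptions, not part of the ArcView workflow.

```python
def pad_lot_number(raw, width=6):
    """Left-pad a lot number with zeros to a fixed-width string."""
    text = str(raw).strip()
    if len(text) > width:
        raise ValueError(f"lot number {text!r} is longer than {width} characters")
    return text.zfill(width)


print(pad_lot_number("123"))  # 000123
print(pad_lot_number(45))     # 000045
```

Storing the value as a string rather than a number preserves the leading zeros when the table is later linked to other databases.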


Figure 2
Edgematching Problem and Solution

Due to the nature of the paper map manuscripts, the lots which sit across adjoining map sheets generally do not match perfectly when the sheets are placed next to each other. Edgematching is the process by which the arcs which form a polygon across adjoining coverages are aligned to meet at the map edges. Arc/Info provides an automated process, but if the displacement of arcs is considerable this process will distort the shape of the polygon. In such cases, the arcs are manually aligned using Arcedit tools. In the absence of good quality manuscripts on a stable medium, this is the only option to achieve geometric integrity. Figure 2A shows two adjacent map sheets before edgematching while Figure 2B shows the results if automatic edgematching is activated. Note the geometric distortion in Figure 2B. This problem required manual edits to align the arcs, with the results shown in Figure 2C. The three graphics were captured from ArcView at 1:1900 scale.



This is the first step in the process whereby the numerous map sheets are joined together to create a seamless map. Such a map is necessary to permit spatial analysis and query across the artificial and arbitrary limits of standard map sheets on paper. The mapjoin command in Arc/Info rebuilds topology and as such is subject to the same limitations of the build and clean commands. Specifically, the critical limitation in PC Arc/Info is that a "polygon must not have more than 5000 arcs" (see further discussion in "Problems, Issues and Solutions").

An alternative to mapjoin is to use the update command. It achieves the same objective of joining the maps. The difference is that mapjoin retains the original polygon IDs of each coverage while update regenerates new IDs for the new joined coverage. (Update is equivalent to the cut and paste technique and is used mainly for updating sections of coverages). A third command is append which merely puts all the coverages together but does not rebuild topology, i.e., append requires a second step to either clean or build the new coverage.

A deliberate process of progressively joining and dissolving several contiguous map sheets was adopted in order to learn and adapt from mistakes or discover more efficient ways to undertake the various tasks. The USM Team was also cautiously testing and pushing the limits of the pc Arc/Info software as well as the computer.




Even though the mapjoin command creates a seamless map, the lots which sit across adjoining maps are retained as separate polygons. The dissolve command is used to merge these polygons into a single lot using lot numbers as the criterion for the operation. Note that dissolve operates over the entire coverage and not only at the joints. Even though lot numbers are repeated in the cadastral maps, the operation will only merge adjacent polygons, and there is generally no danger of an illegal merge because lot numbers are unique within the lowest administrative unit (i.e. mukim or town section). Nevertheless, we have come across adjacent lots with the same number but belonging to different administrative units.




This process is relatively straightforward provided certain pitfalls are avoided. For example, polygons adjacent to each other but not given a lot number (i.e. left blank) because of insufficient information will be merged. This will also happen if these polygons are all given an "na" value (for "not available"). This can be resolved by giving each of these polygons a unique value such as "na1", "na2", etc. If polygons for roads and access are all labeled "road" (or "jalan") they will be merged into one continuous polygon (since all public roads are joined to each other!). In the case of Penang Island, this caused the dissolve operation to bail out with the uninformative message that a polygon has "more than 5000 arcs" (see further discussion later in "Problems, Issues and Solutions"). Quite often, the ELIMINATE command may be required to get rid of slivers created during the mapjoin operations since it is very difficult to create boundary boxes which fit exactly to the map extent.

When the dissolve operation has been successfully carried out, the result is a seamless map.
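
The merging behaviour and the "na" pitfall described above can be sketched with a simple union-find over an adjacency list. This is a conceptual illustration under stated assumptions, not the DISSOLVE implementation in pc Arc/Info; the polygon indices, adjacency pairs and function names are hypothetical.

```python
def make_unique_placeholders(lot_numbers, blank_values=("", "na")):
    """Replace blank or 'na' lot numbers with 'na1', 'na2', ... so they never match."""
    result, counter = [], 0
    for lot in lot_numbers:
        if lot.strip().lower() in blank_values:
            counter += 1
            result.append(f"na{counter}")
        else:
            result.append(lot)
    return result


def dissolve(lot_numbers, adjacent_pairs):
    """Group polygon indices that are adjacent AND share the same lot number."""
    parent = list(range(len(lot_numbers)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for a, b in adjacent_pairs:
        if lot_numbers[a] == lot_numbers[b]:
            parent[find(a)] = find(b)

    groups = {}
    for i in range(len(lot_numbers)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Polygons 0 and 1 are two halves of lot 000012 split by a sheet edge;
# polygons 2 and 3 are adjacent unnumbered lots; polygon 4 reuses the
# number 000012 elsewhere and touches none of the others.
lots = ["000012", "000012", "na", "na", "000012"]
adjacency = [(0, 1), (1, 2), (2, 3)]

print(dissolve(lots, adjacency))
print(dissolve(make_unique_placeholders(lots), adjacency))
```

In the first call the two unnumbered lots are wrongly merged; after the placeholders are made unique they stay separate, while the split lot is still merged and the non-adjacent lot with the same number is left alone.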

Add Administrative Area Codes


At this stage, a shapefile of the dissolved map is created and used as the basis for delineating the administrative boundaries (mukims, town sections, etc.). This is done by merging all the polygons within an administrative area into a single polygon. Each of the administrative areas is then assigned a unique code. Using the spatial query facility of ArcView, this administrative area map is used to select all the lots which fall into the respective administrative areas and to assign the administrative area code to them globally. This map can be used in subsequent analysis to retrieve or query the maps according to administrative areas, even from maps containing data which are not based on lots (including point and line data).


Future updates of the maps to include new subdivisions will also be facilitated as administrative area boundaries are rarely changed except by the creation of say, new gazetted townships. In such a case, the new boundaries can be drawn over the existing administrative unit layer and used to update the administrative area codes for the affected land parcels. This approach of using an administrative area layer to input and update administrative codes is preferred as it ensures that every land parcel within the administrative area will be assigned the same code. It also reduces the amount of repetitive keying-in required as well as reduces the size of the database.
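
The idea of assigning codes from an administrative area layer can be sketched with a basic point-in-polygon test. This is an assumption-laden illustration: it uses one representative point per lot and a ray-casting test, whereas ArcView's spatial query may apply different selection criteria, and all coordinates and codes below are invented.

```python
def point_in_polygon(point, ring):
    """Ray-casting test: True if point lies inside the polygon ring."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_unit_codes(lot_points, unit_rings):
    """Give each lot the code of the administrative unit containing its point."""
    assigned = {}
    for lot_id, point in lot_points.items():
        for code, ring in unit_rings.items():
            if point_in_polygon(point, ring):
                assigned[lot_id] = code
                break
    return assigned


# Two hypothetical square administrative units side by side.
units = {
    "01": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "02": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
# Hypothetical representative points for three lots; lot "c" lies outside both.
lots = {"a": (5, 5), "b": (15, 5), "c": (25, 5)}

print(assign_unit_codes(lots, units))
```

Because the code is assigned from the area layer rather than keyed in per lot, every parcel inside a unit receives the same value, which is the consistency benefit noted above.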

Figure 3 shows the coding system developed by the Federal Land Survey and Mapping Department and is currently in use in Penang State. It generates a 15-digit (string) code that uniquely identifies each parcel of land in the country. The USM Team has adapted the system by assigning a sequential code to each of the administrative units (named UnitTadbirID). This UnitTadbirID code together with the 6-digit lot numbers also uniquely identifies each land parcel on Penang Island. The advantage of having a single code (with only 2 digits) instead of a combination of 4 separate fields (making up 9 digits) is that it reduces data entry and shortens the field required for linking the tables and database. It is a more efficient form of data storage and retrieval commonly used even for ArcView and other GIS and relational database applications. Of course, at the national level, the large number of such administrative units may render the single code system less beneficial or even redundant.

A lookup table of all the codes was created in the counterpart application in Microsoft Access. This table was then exported and is available to ArcView to be linked to the digital cadastral layer for display and retrieval purposes.
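
A minimal sketch of how a short unit code plus the 6-digit lot number can serve as a link field is given below. The concatenated key format, the function name and the unit names in the lookup table are assumptions for illustration only; the text states only that the two values together identify a parcel.

```python
def parcel_key(unit_tadbir_id, lot_number):
    """Combine a 2-digit UnitTadbirID with a 6-digit lot number into one key."""
    return f"{int(unit_tadbir_id):02d}{str(lot_number).zfill(6)}"


# Hypothetical lookup table: UnitTadbirID -> administrative unit name.
unit_lookup = {1: "Unit A", 2: "Unit B"}

# The same lot number appears in two different administrative units.
parcels = [(1, "000123"), (2, "000123")]

for unit_id, lot in parcels:
    print(parcel_key(unit_id, lot), unit_lookup[unit_id])
```

The two parcels share a lot number but receive distinct keys, so records in a non-GIS table can be linked to the correct polygon through a single field.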



ArcView Interface

Finally, the pc Arc/Info coverage is converted into an ArcView shapefile to take advantage of the dramatic improvement in the speed of display and refresh of shapefiles compared to Arc/Info coverages. And as was previously mentioned, shapefiles do not have any limitation in terms of the number of arcs per polygon since each polygon is in fact an object rather than a series of primitive entities (arcs, points). Hence, the ugly pseudo-polygons mentioned above can be removed to truly display the cadastral geography of Penang Island. We also converted the JPBD/PEGIS workstation Arc/Info coverage into a shapefile in order to merge the Georgetown portion with the rest of the Island.

The above procedures and steps were generally adhered to by the USM Team. Nevertheless, some additional steps had to be taken when the team had to backtrack to capture more detailed maps (i.e. using larger scale maps) for areas such as the Fettes Park/Tg. Tokong area, the western part of Georgetown and Balik Pulau Town. The additional steps included removing sections (cut) where the newly digitised maps (larger scale) overlapped with previously digitised maps (smaller scale) before an update (paste) is made. This is necessary because the difference in scale (and hence accuracy and detail) and the varying degree of manuscript distortion usually result in the non-coincidence of polygons from the two layers. An alternative approach is to undertake an update first and then eliminate the resultant slivers, but since the entire section was being replaced, it is much neater and easier to remove the relevant section from the earlier map.

It is noted that the JPBD/PEGIS approach for capturing the eastern part of Georgetown differs from our approach. It is understood that all the standard map sheets were first digitised, then appended together into a single layer and subsequently "cleaned" to create topology. Polygons at the edges of individual maps were not closed, but the "cleaning" process created intersections, removed dangles or overshoots and closed these polygons. Additional manual edits to close polygons were undertaken where necessary. There are two critical requirements for this approach to be successful, namely: (a) the arcs for polygons at the edges are extended sufficiently to ensure that they will intersect with their corresponding polygons on the adjacent map; and (b) the original map manuscripts are of extremely high quality with little or no geometric distortion caused by shrinkage or stretching.

Filenaming Conventions

With one file for each individual map sheet digitised for the project, more than a hundred separate files were created from the first round of digitising alone. In addition, to facilitate disaster recovery in case of errors in processing or inadvertent loss due to data corruption, a historical series of files corresponding to each of the major stages of data capture discussed in the previous section was stringently maintained.

The following filenaming convention was adopted for easy recognition and retrieval of the relevant files (Table 1).


Stage of Data Capture | Filename Convention (limited to 8 characters) | Example of Filename
 | LT + Sheet No. + D1 |
 | LT + Sheet No. + C1 |
 | LT + Sheet No. + L1 |
 | LT + Sheet No. + T1 |
 | LT + Sheet No. + KP |
 | No new files |
 | J + Starting & Ending Sheet Nos. |
 | JD + Starting & Ending Sheet Nos. |



The filenames must conform to the DOS limit of not more than 8 characters. Hence, in certain cases the names may have to be shortened, especially for the mapjoin and dissolved files. The letters associated with the filenames are intuitive. For example, LT is short for "lots", D for "digitising", C for "clean", J for "join" and so on.
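
The per-sheet convention and the 8-character limit can be sketched as a small helper. The function name and the sample sheet numbers are hypothetical, and the helper simply rejects names that are too long rather than guessing how the team shortened them.

```python
DOS_NAME_LIMIT = 8  # maximum length of a DOS filename, excluding the extension


def sheet_filename(sheet_no, stage_suffix):
    """Build 'LT' + sheet number + stage suffix (e.g. D1, C1) and check its length."""
    name = f"LT{sheet_no}{stage_suffix}"
    if len(name) > DOS_NAME_LIMIT:
        raise ValueError(f"{name!r} exceeds {DOS_NAME_LIMIT} characters")
    return name


print(sheet_filename("123", "D1"))  # LT123D1
print(sheet_filename("123", "C1"))  # LT123C1
```

Since "LT" and the two-character suffix already use four characters, only four remain for the sheet number under this scheme.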



The Procedure for Creating The Digital Zoning Plan

The digital cadastral map was used as the basis for creating the zoning plan, using ArcView's editing capabilities entirely. An ArcView shapefile of the cadastral map was duplicated and then split into seven separate files corresponding to the area of coverage of the seven paper-based zoning plans. This procedure allowed several staff to work on creating the digital zoning maps at the same time. It also considerably reduced the size of the working file and hence facilitated faster editing, refresh and saving of files. It is strongly recommended that such work be carried out on sub-sets of the data to reduce data corruption, which can occur quite frequently if a large number of edits are made between saves and the system crashes due to memory problems.

The procedure for creating the zoning map involved the attachment of a 3-character code to a new field in the designated zoning shapefile. The codes correspond to each of the legal definitions of a land use category or zone in the approved zoning plan. A legend is then created and saved in an "avl" file to be used by all the staff working independently on separate machines. When loaded, the maps will be displayed according to the colours and patterns specified in the legend file.

Most of the work involved straightforward selection of the appropriate lots, cross checking with the paper zoning plans and keying-in the codes for the zones in the table associated with the shapefile. In most situations, multiple lots for a large section of the map can be selected simultaneously and the zoning codes assigned in a single operation using the CALCULATE operation to update the zoning field in the table. Hence, the process is much faster and simpler than the creation of the cadastral map.
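
The bulk assignment step can be pictured as a single update over the selected records. The sketch below mimics that idea on a plain list of records; it is not ArcView's CALCULATE operation, and the field names, lot keys and the zone code "RES" are hypothetical.

```python
def calculate(records, selected_keys, field, value):
    """Write value into field for every selected record; return how many changed."""
    updated = 0
    for record in records:
        if record["lot_key"] in selected_keys:
            record[field] = value
            updated += 1
    return updated


# Hypothetical attribute table with an empty 3-character zoning field.
table = [
    {"lot_key": "01000012", "zone": ""},
    {"lot_key": "01000013", "zone": ""},
    {"lot_key": "01000014", "zone": ""},
]

# Two lots selected together on the map receive the same zone code at once.
count = calculate(table, {"01000012", "01000013"}, "zone", "RES")
print(count, [r["zone"] for r in table])
```

Assigning one code to a whole selection is what makes zoning entry much quicker than keying lot numbers one record at a time.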

Nevertheless, in many situations, further work to split the lots has to be undertaken. For instance, the zoning plan includes proposed roads which may cut across the lots, road widening proposals which also require lots adjacent to existing roads to be split, and cases where there are multiple land use zones for the same land parcel. Since these additional features are merely indicative of the local authority’s intentions and a high level of accuracy is not important, all the editing work to create the new features was undertaken through "screen digitising" using the geometric shape of the lots as guides. A lot of eye judgment is therefore required. For such work, draughting experience is a definite advantage for technicians. However, the process of screen digitising can be very strenuous on the eye.

For more complicated areas where there are insufficient reference points, the paper zoning maps were scanned and registered to the shapefiles to provide a visual reference in the background for on-screen digitising. However, because the zoning plans are on a very small scale (1 inch : 32 chains), the amount of distortion and displacement is very large when viewed with the shapefiles, even after manipulating the world file. The scanned maps were used mainly for large sections of contiguous lots or previously unmapped areas such as the coastal reclamation area where registration can be easily accomplished.




All in, about 9 months were spent capturing the cadastral maps, with several rounds of backtracking to resolve issues mainly related to poor manuscripts. Typically, a well-trained technician can complete digitising one standard sheet in about 5 to 6 hours. More complex maps with intensely subdivided land may take up to 8 hours or one working day. The process of building topology is quite straightforward (cleaning and building; labeling). The tedious part is inputting the lot numbers, which has to be done manually for each lot. Fortunately, ArcView’s visual interface provided the necessary tools for this. On average, it takes about 3 to 4 hours to input the lot numbers for a single map sheet.

The most taxing part of the data capture process is to edgematch the adjoining map sheets. On average it may take about 1 working-day to edgematch four adjoining mapsheets. More complicated cases requiring constant reference to the manuscripts to confirm lot boundaries and adjacent lot numbers to manually edit and align arcs and vertices may take up to 2 working-days for four adjoining mapsheets.

During the entire process, three technicians from the School of HBP, USM were involved with 2 to 3 students employed during the semester breaks to speed up the process. From experience, the staff must be conscientious and meticulous to maintain a high standard of product. It is not advisable to engage staff or workers on a piecemeal basis because it will result in inconsistent quality thereby necessitating more effort in cleaning up errors and omissions.

In the early stages of the digitising work, five 486 PCs with 16 Megabytes of RAM and 200 Megabytes of harddisk space were used. These machines were suitable for the initial digitising, cleaning and labeling of single map sheets. However, they became very slow when running ArcView (for inputting the lot numbers). As the maps were progressively joined together, the system became excruciatingly slow and eventually refused to cooperate when about two-thirds of the Island had been joined. It was impossible even to display the joined maps in Arcedit (it bailed out of Arcedit with the message "no free channels").

Subsequently, a Pentium Pro 200 MHz machine with 64 Megabytes of RAM was used for joining and processing the maps. The mapjoin, update and dissolve operations each work on the entire map, so the processing time is about the same for each operation. For example, even though we had already joined the map sheets into three major sections (north, central and south), which considerably reduced the number of edges to be pieced together, the join operation took 3 hours. A follow-up dissolve operation on the joined map required another 3 hours of processing time. It is thus strongly recommended that high-end PCs (Pentium Pro or Pentium II) be used for any serious work to create digital maps of such extensive scale and coverage.

As a point of reference, the cadastral map for the area outside of Georgetown prepared by the USM Team comprised about 37,550 polygons taking up about 14 Megabytes of diskspace for pc Arc/Info coverages (the ArcView shapefile is slightly smaller in filesize).

The work involved in creating the digital zoning plan was accomplished at a much faster rate. Two technicians working simultaneously completed the task in about two months.




The following are some of the problems and issues encountered in the course of creating the digital maps of Penang Island. The discussion is presented to facilitate intelligent and appropriate use of the maps as well as to guide future work. It is not the intention of this paper to criticise the shortcomings of the parties concerned. In fact, given the inherently tedious and demanding nature of the manual method for the upkeep of maps, this discussion should serve to spur the push to embrace information technology for map-related functions.

  1. Paper Map Manuscripts Distorted

    Under the Terms of Reference (TOR), 74 sheets of 1 inch to 4 chain standard sheets were to be digitised by the USM Team. Initial investigations by the team revealed substantial distortion of the paper maps. The map extents were found to have been stretched or shrunk by as much as half an inch (equivalent to about 130 feet on the ground on the 1 inch to 4 chain standard sheets) along both the width and the height of the map, evidently due to the printing process and environmental conditions. Even linen-based maps were not spared. This was reported to the Town Planning Department but, in the absence of a readily available stable medium, it was decided that the paper maps would be used.

    Distortion in the manuscript presents two problems. Firstly, the geometric shapes of the lots are affected even though areal coverage would remain the same. Secondly, edgematching of map sheets becomes extremely tedious and requires a considerable amount of manual editing to align the arcs of polygons split across adjacent map sheets. The problem is especially acute in heavily developed areas with numerous subdivisions (e.g. terrace, semi-detached houses) at the edges of the standard sheets. This problem can be resolved if the Survey Department can sell or loan the original tracings (i.e. the plastic or Mylar sheets from which the prints are made) for digitising. An examination of the tracings at the Survey Department shows that the edges match perfectly across standard sheets (with a few exceptions). Unfortunately, the Department has discontinued sale of the tracings even to government departments. However, it could be receptive to overtures from a government department that promise mutual benefit, such as an exchange for digital copies of the map. This is particularly so in the light of the effort being made by the Survey Department to encourage other government departments to participate in its NALIS (National Land Information System) Project.
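To put the paper distortion into ground units, the following minimal sketch converts a length measured on the sheet into feet on the ground. It assumes only that 1 chain equals 66 feet; the function name is hypothetical. Half an inch on a 1 inch to 4 chain sheet works out to 132 feet, the same order of magnitude as the figure quoted above.

```python
FEET_PER_CHAIN = 66  # 1 chain = 66 feet


def ground_feet(paper_inches, feet_per_inch):
    """Ground distance represented by a length measured on the paper map."""
    return paper_inches * feet_per_inch


four_chain_scale = 4 * FEET_PER_CHAIN  # 1 inch : 4 chains = 264 feet per inch

# Half an inch of stretch or shrinkage on a 1 inch : 4 chain sheet.
print(ground_feet(0.5, four_chain_scale))
# The same paper distortion on a 1 inch : 80 feet sheet, for comparison.
print(ground_feet(0.5, 80))
```

The comparison shows why the same physical shrinkage of the paper matters far more on the 4 chain sheets than on the town series.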

  2. Inconsistent Map Updates

    Under the TOR, the paper maps would be provided by the Town Planning Department, MPPP with the understanding that the maps reflect up-to-date information on land subdivision. As described in the User Needs Report, the Town Planning Department has a manual system of updating its standard sheets based on the Certified Plans circulated by the Survey Department. This is done by plotting the new subdivision in red ink on the paper maps. The parent lot boundaries are crossed out with "x’s". This update is also carried out on the tracings (bought before the Survey Department discontinued sale).

    As was discovered during digitising, this manual system is severely defective despite the dedicated effort of the Town Planning Department staff. The red lines and crosses on the map sheets often caused confusion, especially when a particular lot had undergone several changes in lot boundaries. This is worse if the maps are printed from the tracing without the benefit of colour to differentiate the amended boundaries from the original. Human errors also resulted in inconsistent updates across adjacent map sheets. For example, a particular lot may have been updated on one map but its corresponding portion on an adjacent sheet was not. There were also instances of different lot numbers being written for the same lot across adjacent map sheets, giving an initial impression that they are separate lots. This creates confusion with the potential to corrupt the data. In these situations, the USM Team had to double-check with new maps purchased directly from the Survey Department. A significant amount of time and resources was lost because of these problems.

    With the introduction of mapping and GIS technology, these periodic updates can be made hassle-free. The Town Planning Department would merely have to buy digital copies of the Certified Plans to update its cadastral database using GIS utilities, or update its digital cadastral map itself through digitising.

    However, the problem of timeliness of updates and information is also encountered by the Survey Department. It is a well-known fact that the department is severely understaffed, with substantial backlogs in carrying out the final surveys which are required before the preparation of the certified plans and the issuance of final land titles. The use of private professional surveyors has helped to alleviate the backlog but certain problems persist. One of the problems is the priority or urgency accorded to each project. For instance, the roundabout at the Jalan Jelutong-Jalan Mesjid Negeri-Jalan Udini intersection has been in existence for more than 20 years. However, the standard sheet does not capture this detail and still shows the former subdivisions (i.e. there is no roundabout on the standard sheet). Apparently, there is no urgency to survey the site since there is no pressure to issue final title of ownership. Another problem is the time lag between final survey and actual plotting of the new subdivisions on the standard sheets. For instance, in the case of Bayan Baru New Town, the Penang Development Corporation (PDC) had already submitted the survey drawings and final survey had been carried out on some of the areas. However, the latest standard sheets from the Survey Department do not show any of the new subdivisions.

    Another inconsistency in updates relates to the map series of gazetted town limits. Typically, for town areas with higher intensity of development, a separate larger scale series is prepared, e.g. for Balik Pulau Town. In these map series, the Survey Department is apparently concerned only with maintaining and updating subdivisions within the town limits. However, gazetted town limits are generally irregular in shape and neither conform to nor fill up the entire rectangular surface of the standard map sheets on the fringes of the town. As such, part of these maps will also show subdivisions outside the town limit. The problem is that the Survey Department has not taken due care to ensure consistency between what is shown on the outskirts of the town maps (such as the 1 inch : 1 chain maps for Balik Pulau Town) and the corresponding series for areas immediately outside the town limit (i.e. the 1 inch : 4 chain maps). Due care and diligence must therefore be taken by the data capture team to resolve conflicts between the different map series.

  3. Map Scale

    Figure 4
    Perfect Alignment

    In principle, larger scale maps are preferred for purposes of digitising because they provide greater detail and higher accuracy (large scale covers a smaller area for the same size of map sheet area). As shown in Figure 4, it is possible to obtain almost perfect alignment even without using the edgematching utilities of the GIS software. This is possible when there is little or no paper shrinkage or stretching and when the lots are of large sizes and dimensions. The original maps on the left were 1 inch : 80 feet while those on the right were 1 inch : 40 feet.


    On the other hand, larger scales for the same areal coverage mean a greater number of map sheets to be digitised, with potentially greater effort required to resolve edgematching problems. It must be remembered that the scale at which the maps were originally prepared affects the level of accuracy the survey draftsmen worked with. On paper, the discrepancies are not so obvious, but in a mapping or GIS package with its capability for zooming in to very fine detail, the errors become very obvious (to put it mildly). The problem is evident when joining, say, a map digitised from a 1 inch : 4 chain map with a larger scale map of 1 inch : 80 feet. We have discovered that roads which are supposed to be the same width appear to be of different widths when these maps are joined together in the GIS. Generally, the ability of the draftsmen to accurately draw, say, a 20-foot reserve is lower on a 1 inch : 4 chain map than on a 1 inch : 80 feet map. On top of this, shrinkage (or stretching) during printing to paper further aggravates the problem.
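The drafting difficulty can be made concrete by computing how wide a 20-foot reserve is on paper at each scale. This is a minimal sketch assuming only that 1 chain equals 66 feet; the function name is hypothetical.

```python
FEET_PER_CHAIN = 66  # 1 chain = 66 feet


def paper_inches(ground_feet, feet_per_inch):
    """Length on paper (in inches) of a feature of the given ground size."""
    return ground_feet / feet_per_inch


reserve_ft = 20
scales = [
    ("1 inch : 4 chain", 4 * FEET_PER_CHAIN),  # 264 feet per inch
    ("1 inch : 80 feet", 80),
]

for label, feet_per_inch in scales:
    width = paper_inches(reserve_ft, feet_per_inch)
    print(f"{label}: {width:.3f} inch on paper")
```

At 1 inch : 4 chain the reserve is less than a tenth of an inch wide on paper, so even a slight drafting or digitising slip is a large fraction of its width. At 1 inch : 80 feet the same reserve is a quarter of an inch wide.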

    Figure 5
    Errors Resulting from Accuracy of Different Map Scales

    Figure 5 shows errors arising when maps of different scales are joined together. The map sheet on the left was digitised from a 1 inch : 80 feet map while the one on the right was from a 1 inch : 40 feet map.



    Nevertheless, for cadastral map digitising, the availability of map series is determined by what is produced and sold by the Survey Department. In less developed (rural) areas, the only maps available may be 1 inch : 4 chain in scale. Any scale smaller than this is likely to be inaccurate and extremely difficult to digitise because digitisers do not have very high sensitivity. Narrow lots such as access or even terrace lots are likely to snap together. The ad hoc nature of subdivision by individual landowners has also resulted in tiny strips or pieces of land which require ingenious solutions to digitise. For instance, in some situations, we found it impossible to set a workable snap distance and had to resort to overshooting intersections to avoid snapping onto the nearest vertex or node.

    But even at 1 inch : 4 chain, it is difficult to maintain a high level of accuracy, especially for terrace house subdivisions and narrow access reserves. It is therefore not surprising that the Survey Department has adopted the practice of showing only the outline of the terrace housing block without the individual lots on the standard sheets. Only the relevant certified plan (CP) number of the subdivision is recorded on the standard sheet. Ideally, a new map series on a larger scale should have been created for such areas to show all the new subdivisions on a standard sheet rather than by reference to numerous CPs.

    It has been suggested that the amount of effort is the same no matter what scale is used for digitising, since the number of lots is the same regardless of map scale. This perception is incorrect because larger scale maps mean a greater number of map sheets which must be prepared for digitising and subsequently edgematched and joined together. Retrieving each of the CPs to capture the individual subdivisions not shown on the standard sheets will also involve a substantial increase in the amount of time and manpower resources.

    A point to note about the map series is that, regardless of scale, each map covers a standard paper area of 35 inches by 25 inches. This has resulted in the non-coincidence of the different map series, that is, maps of large scale do not fit exactly in the area of coverage of the small scale series. To be more specific, larger scale maps (say 1 inch : 40 feet) would overflow from one standard sheet into the adjoining sheet of a smaller scale map (say 1 inch : 4 chain). Where the map series of larger scale terminates, narrow strips of land are maintained in maps of the smaller scale. Obviously, maintaining these multiple series of maps requires a tremendous amount of effort. In an environment of rapid urban growth, it is not surprising that the manual method employed by the Survey Department is hard-pressed to be current.
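The effect of scale on the number of sheets, and the non-coincidence of the series, can be illustrated with a short calculation. This sketch assumes only the 35 inch by 25 inch standard paper area quoted above and 1 chain = 66 feet; the function names are hypothetical.

```python
FEET_PER_CHAIN = 66
SHEET_W_IN, SHEET_H_IN = 35, 25  # standard paper area quoted in the text


def sheet_ground_area_sqft(feet_per_inch):
    """Ground area covered by one standard sheet at the given scale."""
    return (SHEET_W_IN * feet_per_inch) * (SHEET_H_IN * feet_per_inch)


def sheets_per_sheet(small_scale_fpi, large_scale_fpi):
    """Number of larger-scale sheets needed to cover one smaller-scale sheet."""
    return sheet_ground_area_sqft(small_scale_fpi) / sheet_ground_area_sqft(large_scale_fpi)


four_chain = 4 * FEET_PER_CHAIN  # 264 feet per inch

print(sheets_per_sheet(four_chain, 80))  # 1 inch : 80 feet sheets per 4 chain sheet
print(sheets_per_sheet(four_chain, 40))  # 1 inch : 40 feet sheets per 4 chain sheet
print(four_chain / 80, four_chain / 40)  # linear ratios between the series
```

Roughly eleven 1 inch : 80 feet sheets, or over forty 1 inch : 40 feet sheets, are needed to cover the ground shown on a single 1 inch : 4 chain sheet. The linear ratios (3.3 and 6.6) are not whole numbers, which is consistent with the sheet edges of the different series failing to coincide.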

    In the GIS, map features are stored in real-world units (feet, meters, longitude/latitude), that is, there is no scale in the GIS database. The problems related to map scale discussed above will become irrelevant once the entire cadastral database is captured in a GIS (whether through digitising, coordinate geometry or some other method). Scale will, however, remain important for display purposes and hardcopy map production.



  4. System of numbering lots

    The Survey Department has implemented a coding system for identifying land parcels in the digital database comprising the administrative area code and a 6-digit lot number. The administrative area code is identified by separate codes for State, District, Mukim or Towns, and Town Sections. The combination of these codes produces a 15-digit code (actually, 15 characters because they are stored as strings) which uniquely identifies every piece of land in Malaysia and can be used to link various data sources based on lot numbers. The applications developed by this pilot project for the Town Planning Department, MPPP adopted the same system and codes used by the Survey Department, Pulau Pinang.

    However, this system is only being implemented by the Survey Department as and when new lots are created and captured into their Mini-CALS (Computer Aided Land Survey) system. Hence, new lots on the standard sheets will reflect the 6-digit codes for lot numbers with leading zeros as required (e.g. 000101, 003478). Old lots still maintain their previous system of numbering (e.g. 101, 3478). To maintain consistency, the USM Team converted all the old lot numbers to the new system by adding leading zeros to form 6-digit lot numbers (codes). One complication is that a third system of numbering is still maintained in the standard sheets even though it had been abolished for quite some time. This third system used a combination of numbers and Roman numerals (e.g. 101ix, 1267xiiv) which occasionally caused some difficulties when they exceeded the 6-digit length of the field.

    Even though numbers are used to code the administrative areas and lot numbers, the use of leading zeros means that the fields must be defined as text fields (note that JPBD/PEGIS used numeric fields instead, without leading zeros, in all the fields). It is important to maintain consistency in field definitions not only within the database but across different databases. Different field types will cause problems when linking data across different databases based on a combination of these fields. It is also very important to remember that when concatenating text fields we must use the appropriate operator (i.e. "&" instead of "+" in MS Access; "++" instead of "+" in ArcView). Using the "+" operator (i.e. mathematical addition) to concatenate text fields may seem to produce the same results visually, but internally the results stored in the database will cause complications such as data mismatches, error messages about incompatible data types during links, or an inability to locate the data during search operations. In addition, when concatenating these fields, care must be taken to remove blank spaces (using trim functions) and to convert numeric values to text (using string conversion functions) as necessary. For example, JPBD/PEGIS’s Georgetown layer uses numeric codes for the administrative area codes and lot numbers. These were converted to text values and padded with leading zeros to be compatible with the maps created by the USM Team. Fortunately, this can be accomplished quickly with the CALCULATE operation.
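The padding, trimming and text concatenation described above can be sketched outside the GIS as follows. This is an illustration in Python, not the CALCULATE expressions used in the project. The text gives only the 6-digit lot number and the 15-character total, so the widths assigned to the State, District, Mukim and Section codes here are assumptions, and the sample code values are made up.

```python
# Hypothetical field widths: only the 6-character lot number and the
# 15-character total come from the text; the split of the other 9 is assumed.
WIDTHS = {"state": 2, "district": 2, "mukim": 2, "section": 3, "lot": 6}


def pad(value, width):
    """Convert a code to text, strip stray blanks and left-pad with zeros."""
    text = str(value).strip()
    if len(text) > width:
        raise ValueError(f"{text!r} does not fit in {width} characters")
    return text.zfill(width)


def parcel_key(state, district, mukim, section, lot):
    """Concatenate the padded codes as text (never add them as numbers)."""
    parts = {"state": state, "district": district, "mukim": mukim,
             "section": section, "lot": lot}
    return "".join(pad(parts[name], WIDTHS[name]) for name in WIDTHS)


# Numeric codes and text codes with stray blanks should yield the same key.
key_from_numbers = parcel_key(7, 1, 13, 0, 3478)
key_from_text = parcel_key("07", "01", "13", "000", " 003478 ")
print(key_from_numbers, len(key_from_numbers))
```

Because every field is normalised to fixed-width text before joining, keys built from a numeric source and from a text source compare equal. An old-style lot number such as 1267xiiv is longer than six characters and is rejected by `pad`, which mirrors the field-length difficulty noted for the Roman numeral system.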


  5. Non-closure of roads, access and rivers

    In all the standard sheets, road reserves, access and rivers are not defined as separate land parcels with their own lot numbers (except in the olden days when roads could be held in private ownership in which case separate titles were issued). Hence there is no closure of polygons and all the roads, access and rivers become part of the external polygon (the universe).

    The major implication is that these features cannot be identified and retrieved from the GIS database because they do not exist as separate polygons. In other words, it will not be possible to, say, search the database by road name and display the lots in its vicinity. Another implication is the inability to differentiate these features for cartographic clarity. By extracting roads, access and rivers into separate layers, we can utilise colours to create visual references for map presentation and map reading.

    In the digital cadastral map of Penang Island, we have created separate polygons for these features with general labels only (i.e. jalan, rezab and sungei) intended mainly for cartographic purposes. A separate road layer has also been created for the portion of Georgetown created by JPBD/PEGIS (which did not adopt a policy to create polygons for these features).


  6. PC Arc/Info Limitation

    The most serious problem of pc Arc/Info encountered during this project is the "not more than 5000 arcs in a polygon" limitation. This limit means that any particular polygon must not comprise more than 5000 arcs. By itself, the limit is unlikely to be breached. However, the limit also applies to the external polygon (or the universe). In the case of the Penang Island Cadastral Map, it means the entire coastline around the island will be treated as part of the external polygon. In this situation, the 5000 arc limit will definitely be exceeded, causing the processing to bail out. The work-around is to create a box or several boxes around the Island to reduce the number of arcs associated with the external polygon. The boxes stick out like sore thumbs and cannot be removed as long as pc Arc/Info remains the GIS engine.

    A less obvious case related to this limit concerns the road and river reserves themselves. If all the road reserves for Penang Island are given the same label (say "jalan"), mapjoined and then dissolved on this field with the intention of creating a single polygon showing the connectivity of this feature, a processing error will occur, causing the operation to terminate and bail out. The limit that "a polygon must not exceed 5000 arcs" has been breached, but the error message does not indicate the source of the error. The unsuspecting user may end up spending a considerable amount of time and effort looking for "that polygon" which has more than 5000 arcs. In a large database like the Penang Island Digital Cadastral Map, the uninitiated may not even "see" the road polygon as being a polygon. The solution to avoid the "bail out" is to define several separate road polygons with unique labels (e.g. jalan1, jalan2).
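One way to plan the unique labels in advance is to group the road polygons so that no group can exceed the arc limit after dissolving. The sketch below is a hypothetical planning aid in Python, not a pc Arc/Info command. It assumes the arc count of each road polygon is known beforehand, and it treats the sum of the counts in a group as an upper bound on the arcs of the dissolved polygon, since a dissolve can only remove shared arcs. The polygon ids and counts are made up.

```python
ARC_LIMIT = 5000  # pc Arc/Info limit on arcs per polygon, as described above


def assign_labels(arc_counts, base="jalan", limit=ARC_LIMIT):
    """Greedily group polygons so each label's total arc count stays within the limit.

    arc_counts maps a polygon id to its number of arcs. A new label is started
    whenever adding the next polygon would push the running total past the limit.
    """
    labels, group, running = {}, 1, 0
    for poly_id, arcs in arc_counts.items():
        if arcs > limit:
            raise ValueError(f"polygon {poly_id} alone exceeds {limit} arcs")
        if running + arcs > limit:
            group += 1
            running = 0
        labels[poly_id] = f"{base}{group}"
        running += arcs
    return labels


roads = {"r1": 3000, "r2": 1500, "r3": 1200, "r4": 4000, "r5": 900}  # made-up counts
print(assign_labels(roads))
```

In practice the groups would also follow geography (for example one label per district) so that adjoining road segments dissolve into sensible units; the greedy grouping here only demonstrates how to keep each label under the limit.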

    It is noted that Workstation Arc/Info on Unix before version 7 also had a similar limitation (i.e. not more than 10000 arcs per polygon). In fact, JPBD and PEGIS encountered this problem when they were using an earlier version of Workstation Arc/Info and had to work around the limit by creating pseudo-polygons to enclose Georgetown.

    The saviour for desktop GIS is ArcView 3.0’s shapefiles, which have no such limitation. By converting to shapefiles, the separate road polygons can be merged to achieve the desired results. The pseudo-polygons around the island can also be deleted to preserve geometric and visual integrity.

    A second major limitation of pc Arc/Info occurs when using Arcedit on very large files, such as when all the map sheets for areas outside of Georgetown had been joined and dissolved. The software is unable to accommodate edits such as removing or adding arcs (it bails out of Arcedit). This happened even when using the Pentium Pro machine with 64 Mbytes of RAM and several hundred megabytes of free harddisk space. However, Arcedit is able to display large files and ARC is able to process the files in update, join and dissolve operations.

    A less serious problem with pc Arc/Info occurs during edgematching. When the maps are joined and become very large, subsequent edgematching operations may encounter problems. pc Arc/Info creates a box in which the edgematching will be carried out, and by default it sizes the box at 1/12th the length of the edge to be matched. When this becomes too large for the system to handle, the solution is to specify the size of the box when issuing the edgematch command. Another problem was encountered when we tried to edgematch a small map to a very large map (e.g. a single standard 1 inch to 80 feet map of Georgetown to the rest of Penang Island). The operation bailed out, but this can be overcome by using smaller sections of the large map for the edgematch.

  7. Desktop Technology

    There are two hindrances to the embrace of GIS technology for the day-to-day operations of the planning office. Firstly, GIS technology is very expensive and secondly, the technology is perceived as being difficult to learn and implement. On the first point, a complete GIS setup with software, digitiser, large format printer (plotter) and a computer would cost about RM50,000 for a PC-based (desktop) configuration and RM120,000 or more for a Unix-based system. This may appear to be a small investment considering the immense potential and benefits from the use of IT, but experience even in Malaysia has shown that hardware and software acquisition alone is insufficient for the successful integration of IT into the work culture. This brings us to the second point, about ease of use. Generally, GIS have become very user-friendly as far as the end-user is concerned (as exemplified by ArcView 3.0). Nevertheless, undertaking a GIS project from data capture to customising the interface can be a daunting task which requires considerable resolve, perseverance, and knowledge and understanding of GIS technology. Added to this perceived complexity of GIS is a phobia about computers in general. The more expensive the equipment, the greater the user's fear that his/her inexperience may crash or destroy the system. This fear must be overcome gradually through the integration of the technology into daily routines using affordable and easy-to-use systems.

    The Town Planning Department, MPPP has taken the right step in this direction through the implementation of applications developed by this pilot project. A local area network with Pentium-based PCs using Windows 95 and Microsoft file-sharing capabilities has been implemented and has generated interest amongst the staff to learn and use the technology. A positive sign is that staff with completely no knowledge or experience in using computers have come forward to specifically ask to be taught how to use the system.

    Desktop GIS obviously has limitations in terms of speed and functionality, but advances already taking place will close the gap between the desktop and Unix-based workstations. In particular, Windows NT with its promise of 64-bit technology and cluster processing will become the workhorse for a departmental GIS. ESRI has also promised greater compatibility with Microsoft products, which will translate into greater ease of integration of various components, especially between a GIS and relational databases. Currently, the link between ArcView and MS Access is indirect, through SQL Connect. Hopefully, ArcView will in future permit a direct link to MS Access tables.

    This pilot project has demonstrated that, despite the limitations of PC technology, a GIS system can be successfully developed and implemented. Initially, the 486 PCs with limited RAM and harddisk space were a severe handicap, especially during digitising. If the digitiser tablet was clicked too fast in succession, the PC would freeze because it was unable to respond fast enough. This resulted in lost work between saves and a lot of frustration in having to restart the process. Through deliberate spacing between clicks (delays) to give the PC time to respond, this problem was overcome and the digitising work could be carried out efficiently without disruption. Obviously, using a high-end PC such as the Pentium Pro or Pentium II with a large RAM and large swap space would dramatically improve performance. For most planning offices, there should be no difficulty in acquiring these high-end machines (ranging from RM4,000 to RM7,000 each). The price of software may be a little high for some departments. A copy of pc Arc/Info is about RM20,000 to RM25,000. An alternative is to acquire ArcView 3.0, which is less than RM4,000 per licensed copy, and then progressively acquire other components as necessary.



  8. Trained Personnel

    When the pilot project was started, the three technicians from the School of HBP had no previous exposure to GIS even though they were all familiar with computers and various software including CAD. But by the nature of their training as technicians and their willingness to learn, they have been able to acquire the skills to undertake and complete the project to create the digital databases for Penang Island. The approach adopted for training the technicians was incremental. For instance, they first learned how to digitise, then to clean, build, transform and so on as the project progressed. At each stage, they were given briefings to help them understand each of the steps. Having successfully completed creating the digital maps, the three technicians have acquired the necessary skills to undertake data capture using GIS. Obviously, at this stage of their development there is still a large body of knowledge related to the use or exploration of GIS data which they must acquire to be fully trained as GIS personnel.

    The experience from this pilot project clearly shows that data capture in general, and digitising in particular, require much dedication and meticulous attention to detail. The pride of seeing a job well done has also been crucial, especially when the data quality is not consistently high. In fact, one of the reasons the project has taken longer than anticipated is the need to resolve problems originating from the map manuscripts, which required double-checking, editing, redigitising and updating.




In the process of undertaking this pilot project, the USM Team has gained considerable experience and skills and recommends the following actions, follow-ups and precautions for future work for others who would like to start similar projects.

The assimilation of the IT work culture cannot be attained overnight. It must be incrementally integrated into the workflow to permit workers to gain confidence, in order to encourage them to explore and experiment with new and more effective and efficient ways to accomplish a task. The strategy we recommend is to start with affordable technology which is at the same time easy to use. I believe we have proven, and will continue to endeavour to show, that even the lower-end systems can be made to produce dramatic results. Hopefully, others will not have the painful teething problems we encountered after they have read this paper.

One question which comes to mind is whether we can rely entirely on ArcView, that is, without having to acquire pc Arc/Info. The answer is yes, but as it now stands ArcView cannot perform some of the operations available in pc Arc/Info, such as transformation and edgematching, even though it has digitising capabilities. This deficiency can be overcome by using the Data Automation Kit available from ESRI.

A recommended configuration of the computer system is at least a Pentium Pro or Pentium II with 64 Megabytes of RAM or more and a large harddisk, because it needs a lot of swap space. A large screen (at least a 15 inch monitor) is also useful for a large display area and for keeping multiple windows open at the same time. An A0-sized digitiser is also a good investment even though some may recommend scanning and on-screen digitising as an alternative. A raster-based plotter (also known as a large format printer) is preferred over pen and pencil plotters because the latter are good mainly for line work. Maps or cartographic products, on the other hand, require solid rendering or hatching for thematic mapping. A complete package could be acquired for between RM40,000 and RM50,000 depending on the number of GIS licenses needed. This compares favourably with the RM120,000 or more needed to set up a GIS system on workstations.

On the question of staff training, we recommend that it be a continuous process in which the management identifies a specific project or product to be generated from the "training". In this way, the staff are continuously using the technology rather than spending a week or so on intensive training and then, two months later, having forgotten how to even get started. New techniques will be acquired when needed and at the end of the project a complete product is added to the department's library of products. Experience has shown that training is not effective if the skills are not immediately integrated into the tasks performed daily by the staff.

In conclusion, I urge that the "low-end" of the market be given greater emphasis in terms of intensified development and incorporation of more and greater functionalities for desktop GIS. We must first build up user confidence and from there prod the GIS novices towards becoming power users.




Lee Lik Meng, et al., 1996. "Development Of A GIS-Based Planning System For The Municipal Council Of Penang Island". 2nd Annual GIS Asia Pacific Conference, 18-20 September 1996, Putra World Trade Centre, Kuala Lumpur. In Proceedings of Conference (Disk 5).

Lee Lik Meng, et al., 1996. User Needs Report. Pilot Project for the Development of a GIS-based Planning System for the Municipal Council of Penang Island. November 1996.



This paper is adapted from a report entitled "Digital Cadastral Map of Penang Island" (June 1997) generated by the USM Team as documentation for the Pilot Project with MPPP.
1 chain is equivalent to 66 feet
US$1.00 is equivalent to about RM2.90 (current rate)

Dr. Lee Lik Meng
31 August, 1997


Copyright by Lee Lik Meng, 1998
Last revised : Monday, 10 November 2003 12:33 PM